BIALLELIC MARKERS FOR USE IN CONSTRUCTING A HIGH DENSITY DISEQUILIBRIUM MAP OF THE HUMAN GENOME



Background of the Invention 
Recent advances in genetic engineering and bioinformatics have enabled the manipulation and characterization
of large portions of the human genome. While efforts to obtain the full sequence of the human genome are rapidly
progressing, there are many practical uses for genetic information which can be implemented with partial knowledge of
the sequence of the human genome.

As the full sequence of the human genome is assembled, the partial sequence information available can be used 
to identify genes responsible for detectable human traits, such as genes associated with human diseases, and to develop 
diagnostic tests capable of identifying individuals who express a detectable trait as the result of a specific genotype or
individuals whose genotype places them at risk of developing a detectable trait at a subsequent time. Each of these
applications for partial genomic sequence information is based upon the assembly of genetic and physical maps which 
order the known genomic sequences along the human chromosomes. 

The present invention relates to human genomic sequences which can be used to construct a high resolution 
map of the human genome, methods for constructing such a map, methods of identifying genes associated with 
detectable human traits, and diagnostics for identifying individuals who carry a gene which causes them to express a 
detectable trait or which places them at risk of expressing a detectable trait in the future.

Summary of the Invention 

A first embodiment of the present invention is a method of obtaining a set of biallelic markers comprising the
steps of obtaining a nucleic acid library comprising a plurality of genomic DNA fragments comprising the full genome or a 
portion thereof, determining the order of said plurality of genomic DNA fragments in the genome, determining the 
sequence of selected regions of said plurality of genomic DNA fragments, and identifying nucleotides in said plurality of 
genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers. 

In one aspect of this first embodiment, the identifying step comprises identifying about 20,000 biallelic
markers. In another aspect of this first embodiment, the identifying step comprises identifying about 40,000 biallelic
markers. In a further aspect of this embodiment, the identifying step comprises identifying about 60,000 biallelic
markers. In still another aspect of this first embodiment, the identifying step comprises identifying about 80,000
biallelic markers. In still another aspect of this first embodiment, the identifying step comprises identifying about
100,000 biallelic markers. In still another aspect of this first embodiment, the identifying step comprises identifying
about 120,000 biallelic markers.

In still another aspect of this first embodiment, the biallelic markers are separated from one another by an
average distance of 10 kb-200 kb. In still another aspect of this first embodiment, the biallelic markers are separated
from one another by an average distance of 15 kb-150 kb. In still another aspect of this first embodiment, the biallelic
markers are separated from one another by an average distance of 20 kb-100 kb. In still another aspect of this first
embodiment, the biallelic markers are separated from one another by an average distance of 100 kb-150 kb. In still
another aspect of this first embodiment, the biallelic markers are separated from one another by an average distance of
50-100 kb. In still another aspect of this first embodiment, the biallelic markers are separated from one another by an
average distance of 25 kb-50 kb.

In still another aspect of this first embodiment, the step of determining the sequence of selected regions of
said plurality of genomic DNA fragments comprises inserting fragments of said plurality of genomic DNA fragments into
a vector to generate a plurality of subclones and determining the sequence of a region of the inserts in said plurality of
subclones or a subset thereof. For example, in this aspect of the first embodiment, the step of determining the sequence
of a region of said inserts or a subset thereof may comprise determining the sequence of one or both end regions of said
inserts or a subset thereof. In this aspect of the first embodiment, the step of determining the sequence of one or both
end regions of said plurality of subclones comprises determining the sequence of about 500 bases at each end of said
subclones or a subset thereof.

In still another aspect of this first embodiment, a set of about 10,000 to about 20,000 genomic DNA inserts
with an average size between 100 kb and 300 kb are ordered. In still another aspect of this first embodiment, a set of
about 10,000 to about 30,000 genomic DNA inserts with an average size between 100 kb and 150 kb are ordered. In
still another aspect of this first embodiment, a set of about 15,000 to about 25,000 genomic DNA inserts with an
average size between 100 kb and 200 kb are ordered.

In still another aspect of this first embodiment, the identifying step comprises identifying between 1 and 6 
biallelic markers per genomic DNA fragment. In still another aspect of this first embodiment, the identifying step 
comprises identifying an average of 3 biallelic markers per genomic DNA insert.
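
The insert counts and per-insert marker counts above can be related by simple arithmetic. The following is a minimal sketch, assuming a haploid genome of roughly 3 x 10^9 bp and a single non-overlapping tiling of inserts; neither assumption is stated in the text, and the function names are hypothetical.

```python
# Assumption (not from the text): approximate haploid human genome size.
GENOME_BP = 3_000_000_000

def inserts_for_tiling(avg_insert_bp: int, genome_bp: int = GENOME_BP) -> int:
    """Number of non-overlapping inserts needed to span the genome once."""
    return -(-genome_bp // avg_insert_bp)  # integer ceiling division

def expected_markers(n_inserts: int, markers_per_insert: int) -> int:
    """Total markers if every insert yields the same number of markers."""
    return n_inserts * markers_per_insert

n_inserts = inserts_for_tiling(150_000)       # inserts averaging 150 kb
for per_insert in (1, 3, 6):                  # range and average given above
    print(per_insert, expected_markers(n_inserts, per_insert))
```

Under these assumptions, about 20,000 inserts of 150 kb carrying between 1 and 6 markers each yield between 20,000 and 120,000 markers, which matches the marker counts recited for the identifying step.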

In still another aspect of this first embodiment, the genomic DNA fragments are in a Bacterial Artificial
Chromosome. In still another aspect of this first embodiment, the genomic DNA fragments are in a Yeast Artificial
Chromosome.

In still another aspect of this first embodiment, the method further comprises determining the position of said
biallelic markers along the genome or a portion thereof. In this aspect of the first embodiment, the step of determining
the position of said biallelic markers along the genome or portion thereof may comprise determining the position of said
biallelic markers along a chromosome. In this aspect of the first embodiment, the step of determining the position of
said biallelic markers along the genome or portion thereof comprises determining the position of said biallelic markers
along a subchromosomal region.

In still another aspect of this first embodiment, the method further comprises identifying biallelic markers
which are in linkage disequilibrium with one another. In this aspect of the first embodiment, the method may further
comprise optimizing the intermarker spacing between said biallelic markers such that each identified marker is in linkage
disequilibrium with at least one other identified marker.
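
As an illustration of how linkage disequilibrium between two biallelic markers might be quantified, the sketch below computes the standard coefficients D, D' and r^2 from haplotype and allele frequencies, and checks whether every marker in a set has at least one partner above a threshold. The text does not specify which measure or threshold is used; the function names and the default threshold of 0.2 are assumptions for illustration only.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Return (D, D_prime, r_squared) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A (marker 1) and B (marker 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared

def every_marker_has_partner(r2_matrix, threshold=0.2):
    """True if each marker has r^2 >= threshold with at least one other marker.

    The threshold is a hypothetical cut-off, not a value from the text.
    """
    n = len(r2_matrix)
    return all(
        any(r2_matrix[i][j] >= threshold for j in range(n) if j != i)
        for i in range(n)
    )

print(linkage_disequilibrium(0.5, 0.5, 0.5))   # complete disequilibrium
print(linkage_disequilibrium(0.25, 0.5, 0.5))  # independent markers
```

A check such as `every_marker_has_partner` is one possible way to test whether a candidate intermarker spacing leaves any marker without a partner in linkage disequilibrium.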

In still another aspect of this first embodiment, the portion of the genome comprises at least 200 kb of
contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the genome comprises at least
300 kb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the genome
comprises at least 1500 kb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the
genome comprises at least 2 Mb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion
of the genome comprises at least 5 Mb of contiguous genomic DNA. In still another aspect of this first embodiment, the 
portion of the genome comprises at least 10 Mb of contiguous genomic DNA. In still another aspect of this first 
embodiment, the portion of the genome comprises at least 20 Mb of contiguous genomic DNA. 

In still another aspect of this first embodiment, the method further comprises the step of identifying one or
more groups of biallelic markers which are in proximity to one another in the genome. In this aspect of the first
embodiment, the biallelic markers in each of these groups may be located within a genomic region spanning less than
1kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be located
within a genomic region spanning from 1 to 5kb. Alternatively, in this aspect of the first embodiment, the biallelic markers
in each of these groups may be located within a genomic region spanning from 5 to 10kb. Alternatively, in this aspect of
the first embodiment, the biallelic markers in each of these groups may be located within a genomic region spanning from
10 to 25kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be
located within a genomic region spanning from 25 to 50kb. Alternatively, in this aspect of the first embodiment, the
biallelic markers in each of these groups may be located within a genomic region spanning from 50 to 150kb.
Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be located within a
genomic region spanning from 150 to 250kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in
each of these groups may be located within a genomic region spanning from 250 to 500kb. Alternatively, in this aspect of
the first embodiment, the biallelic markers in each of these groups may be located within a genomic region spanning from
500kb to 1Mb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be
located within a genomic region spanning more than 1Mb.
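
One simple way to form such groups from mapped marker positions is a greedy scan over the sorted coordinates. The sketch below is illustrative only: the text does not prescribe a grouping procedure, and the function name, the coordinate representation (base-pair positions on one chromosome) and the rule that a group needs at least two markers are assumptions.

```python
def group_markers(positions, max_span_bp):
    """Greedily group marker positions so each group spans at most max_span_bp.

    positions: iterable of integer base-pair coordinates on one chromosome
    Returns only groups containing two or more markers.
    """
    groups, current = [], []
    for pos in sorted(positions):
        # Start a new group when this marker falls outside the current window.
        if current and pos - current[0] > max_span_bp:
            groups.append(current)
            current = []
        current.append(pos)
    if current:
        groups.append(current)
    return [g for g in groups if len(g) >= 2]

# Hypothetical coordinates, grouped within regions spanning at most 1 kb.
print(group_markers([100, 900, 5000, 5200, 40000], max_span_bp=1000))
```

Changing `max_span_bp` to 5,000, 10,000 and so on reproduces the alternative region sizes listed above.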

A second embodiment of the present invention is a method of obtaining a set of biallelic markers comprising the
steps of obtaining a nucleic acid library comprising genomic DNA fragments comprising the full genome or a portion
thereof, determining the sequence of selected regions of said genomic DNA fragments, identifying nucleotides in said
genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers, and
determining the order of said biallelic markers along the genome or portion thereof.

A third embodiment of the present invention is a set of biallelic markers obtained by the method of the first
embodiment. In one aspect of this third embodiment, the markers in said set have a known genomic position. In another
aspect of this third embodiment, the markers in said set have a known genomic relationship to one another.

A fourth embodiment of the present invention is a set of biallelic markers having a known relationship to one
another and a known genomic position, said set of biallelic markers being obtained by the method of the first
embodiment. In one aspect of this fourth embodiment, the biallelic markers have a heterozygosity rate of at least about
0.18. In another aspect of this fourth embodiment, the biallelic markers have a heterozygosity rate of at least about 0.32.
In still another aspect of this fourth embodiment, the biallelic markers have a heterozygosity rate of at least about 0.42.
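
For a biallelic marker with allele frequencies p and 1 - p, the expected heterozygosity under Hardy-Weinberg proportions is 2p(1 - p). The sketch below, which assumes Hardy-Weinberg equilibrium (an assumption not stated in the text) and uses hypothetical function names, shows the minor allele frequencies that correspond to the three heterozygosity thresholds above.

```python
import math

def heterozygosity(p: float) -> float:
    """Expected heterozygosity of a biallelic marker with allele frequency p."""
    return 2 * p * (1 - p)

def minor_allele_frequency(h: float) -> float:
    """Minor allele frequency giving expected heterozygosity h (0 <= h <= 0.5)."""
    return (1 - math.sqrt(1 - 2 * h)) / 2

for h in (0.18, 0.32, 0.42):
    print(h, round(minor_allele_frequency(h), 3))
```

Under that assumption, heterozygosity rates of 0.18, 0.32 and 0.42 correspond to minor allele frequencies of about 0.1, 0.2 and 0.3 respectively.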

A fifth embodiment of the present invention is a map comprising an ordered array of at least 20,000 biallelic
markers obtained by the method of the first embodiment. In one aspect of this fifth embodiment, the map comprises an
ordered array of at least 60,000 biallelic markers obtained by the method of the first embodiment. In another aspect of
this fifth embodiment, the map comprises an ordered array of at least 120,000 biallelic markers obtained by the method
of the first embodiment.

In another aspect of this fifth embodiment, the biallelic markers are distributed at an average marker density of
one marker every 150kb. In a further aspect of this fifth embodiment, the biallelic markers are distributed at an average
marker density of one marker every 50 kb. In a further aspect of this fifth embodiment, the biallelic markers are
distributed at an average marker density of one marker every 25 kb.
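
The marker counts and marker densities of the fifth embodiment are related through the genome size. The following sketch assumes a haploid genome of roughly 3 x 10^9 bp, a figure not given in the text, and a hypothetical function name.

```python
# Assumption (not from the text): approximate haploid human genome size.
GENOME_BP = 3_000_000_000

def markers_for_spacing(avg_spacing_bp: int, genome_bp: int = GENOME_BP) -> int:
    """Number of markers implied by one marker every avg_spacing_bp."""
    return genome_bp // avg_spacing_bp

for spacing_kb in (150, 50, 25):
    print(f"one marker every {spacing_kb} kb ->",
          markers_for_spacing(spacing_kb * 1000), "markers")
```

With this assumed genome size, densities of one marker every 150 kb, 50 kb and 25 kb correspond to maps of about 20,000, 60,000 and 120,000 markers.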

A sixth embodiment of the present invention is a method of identifying one or more biallelic markers associated
with a detectable trait comprising the steps of determining the frequencies of each allele of one or more biallelic
markers obtained by the method of the first embodiment in individuals who express said detectable trait and individuals
who do not express said detectable trait and identifying one or more alleles of said one or more biallelic markers which
are statistically associated with the expression of said detectable trait. In one aspect of this sixth embodiment, the
detectable trait is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity. In
another aspect of this sixth embodiment, the phenotype of said individuals who express said detectable trait and the
phenotype of said individuals who do not express said detectable trait are readily distinguishable from one another. In
still another aspect of this sixth embodiment, the individuals who express said detectable trait and the individuals who do
not express said detectable trait are selected from a bimodal phenotype distribution. In still another aspect of this sixth
embodiment, the individuals who express said detectable trait are at one phenotypic extreme of the population and said
individuals who do not express said detectable trait are at the other phenotypic extreme of the population.
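
A statistical association of this kind could be assessed, for example, with a chi-square test on the 2 x 2 table of allele counts in trait-positive and trait-negative individuals. The text does not name a particular test, so the sketch below is only one possible choice; the function name and the example counts are hypothetical.

```python
import math

def allele_association(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    case_a, case_b: counts of allele A and allele B in trait-positive individuals
    ctrl_a, ctrl_b: counts of allele A and allele B in trait-negative individuals
    Returns (chi_square, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denom == 0:
        return 0.0, 1.0  # a row or column is empty; no evidence either way
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical allele counts: 100 chromosomes sampled in each group.
chi2, p = allele_association(60, 40, 40, 60)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Selecting individuals from the two phenotypic extremes, as described above, tends to increase the allele frequency difference that such a test has to detect.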

A seventh embodiment of the present invention is a method of identifying a haplotype associated with a trait
comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, determining
the frequencies of the alleles of each member of a group of biallelic markers obtained by the method of the first
embodiment which are known to be located in proximity to one another in the genome in said nucleic acid samples, and
identifying a plurality of alleles of biallelic markers having a statistically significant association with said trait. In one
aspect of this seventh embodiment, the detectable trait is selected from the group consisting of disease, drug response,
drug efficacy, and drug toxicity.

In another aspect of this seventh embodiment, the biallelic markers in each of these groups are located within
a genomic region spanning less than 1kb. In still another aspect of this seventh embodiment, the biallelic markers in each
of these groups are located within a genomic region spanning from 1 to 5kb. In still another aspect of this seventh
embodiment, the biallelic markers in each of these groups are located within a genomic region spanning from 5 to 10kb.
In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are located within a
genomic region spanning from 10 to 25kb. In still another aspect of this seventh embodiment, the biallelic markers in
each of these groups are located within a genomic region spanning from 25 to 50kb. In still another aspect of this seventh
embodiment, the biallelic markers in each of these groups are located within a genomic region spanning from 50 to
150kb. In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are located
within a genomic region spanning from 150 to 250kb. In still another aspect of this seventh embodiment, the biallelic
markers in each of these groups are located within a genomic region spanning from 250 to 500kb. In still another aspect
of this seventh embodiment, the biallelic markers in each of these groups are located within a genomic region spanning
from 500kb to 1Mb. In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are
located within a genomic region spanning more than 1Mb.
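
To illustrate how haplotypes built from a group of neighboring markers might be compared between trait positive and trait negative individuals, the sketch below counts each haplotype in both groups and reports an odds ratio. It assumes that phased haplotypes are already available as strings of alleles, which the text does not state, and it uses the Haldane-Anscombe correction (adding 0.5 to each cell) to avoid division by zero. The function name and the example data are hypothetical, and a significance test would still be needed to establish a statistically significant association.

```python
from collections import Counter

def haplotype_odds_ratios(case_haps, control_haps):
    """Odds ratio of each haplotype versus all others, cases against controls.

    case_haps, control_haps: lists of phased haplotypes, e.g. "AG" for
    allele A at the first marker and allele G at the second marker.
    """
    cases, controls = Counter(case_haps), Counter(control_haps)
    n_case, n_ctrl = len(case_haps), len(control_haps)
    ratios = {}
    for hap in set(cases) | set(controls):
        a = cases[hap] + 0.5               # haplotype present, trait positive
        b = n_case - cases[hap] + 0.5      # haplotype absent, trait positive
        c = controls[hap] + 0.5            # haplotype present, trait negative
        d = n_ctrl - controls[hap] + 0.5   # haplotype absent, trait negative
        ratios[hap] = (a * d) / (b * c)
    return ratios

# Hypothetical data: 50 haplotypes sampled in each group.
case_haps = ["AG"] * 30 + ["AT"] * 10 + ["CG"] * 10
control_haps = ["AG"] * 10 + ["AT"] * 20 + ["CG"] * 20
ratios = haplotype_odds_ratios(case_haps, control_haps)
print(max(ratios, key=ratios.get), ratios)
```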

An eighth embodiment of the present invention is a method of identifying one or more biallelic markers
associated with a detectable trait comprising the steps of selecting a gene in which mutations result in a detectable trait
or a gene suspected of being associated with a detectable trait and identifying one or more biallelic markers obtained by
the method of the first embodiment within the genomic region harboring said gene which are associated with said
detectable trait. In one aspect of this eighth embodiment, the detectable trait is selected from the group consisting of
disease, drug response, drug efficacy, and drug toxicity. In another aspect of this eighth embodiment, the identifying step
comprises determining the frequencies of said one or more biallelic markers in individuals who express said detectable
trait and individuals who do not express said detectable trait and identifying one or more biallelic markers which are
statistically associated with the expression of said detectable trait.

A ninth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers
obtained by the method of the first embodiment. In one aspect of this ninth embodiment, the nucleic acids comprise at
least 15 consecutive nucleotides, including the polymorphic nucleotide, of at least five biallelic markers obtained by the
method of the first embodiment. In another aspect of this ninth embodiment, the nucleic acids comprise at least 6
consecutive nucleotides, including the polymorphic nucleotide, of at least ten biallelic markers obtained by the method
of the first embodiment.
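
The requirement that each nucleic acid contain a run of consecutive nucleotides that includes the polymorphic nucleotide can be illustrated by enumerating every window of a given length that covers the polymorphic position. This is a minimal sketch; the function name, the zero-based indexing and the example sequence are assumptions for illustration, not part of the described arrays.

```python
def probes_spanning(seq: str, snp_index: int, length: int = 8):
    """All substrings of the given length that include position snp_index.

    seq: marker sequence containing the polymorphic nucleotide
    snp_index: zero-based position of the polymorphic nucleotide in seq
    """
    if not 0 <= snp_index < len(seq):
        raise ValueError("snp_index must lie within the sequence")
    first_start = max(0, snp_index - length + 1)
    last_start = min(snp_index, len(seq) - length)
    return [seq[s:s + length] for s in range(first_start, last_start + 1)]

# Hypothetical 20-base sequence with the polymorphic base at index 10.
for probe in probes_spanning("ACGTACGTACGTACGTACGT", snp_index=10, length=8):
    print(probe)
```

Raising `length` to 15 gives the longer sequences recited in the aspect above, provided enough flanking sequence is available.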

A tenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids 
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic 
markers known to be located in proximity to one another in the genome. 

An eleventh embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising amplification primers for generating an amplification product comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the method of the first
embodiment.

A twelfth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising amplification primers for generating an amplification product comprising at least 15 consecutive
nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic markers known to be located in
proximity to one another in the genome.

A thirteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising one or more microsequencing primers for determining the identity of the polymorphic base of one or
more nucleic acids comprising at least 15 consecutive nucleotides, including the polymorphic nucleotide, of one or more
biallelic markers obtained by the method of the first embodiment.

A fourteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said
nucleic acids comprising one or more microsequencing primers for determining the identity of the polymorphic bases of
one or more groups of biallelic markers known to be located in proximity to one another in the genome.

A fifteenth embodiment of the present invention is an array of nucleic acids fixed to a support, wherein said
nucleic acids are complementary to one or more microsequencing primers for determining the identities of the
polymorphic bases of one or more biallelic markers obtained by the method of the first embodiment. In one aspect of
this fifteenth embodiment, the nucleic acids are complementary to at least five microsequencing primers for determining
the identities of the polymorphic bases of at least five biallelic markers obtained by the method of the first embodiment.
In another aspect of this fifteenth embodiment, the nucleic acids are complementary to at least ten microsequencing
primers for determining the identities of the polymorphic bases of at least ten biallelic markers obtained by the method
of the first embodiment.

A sixteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising one or more nucleic acids complementary to one or more microsequencing primers for determining the
identity of the polymorphic bases of one or more groups of biallelic markers known to be located in proximity to one
another in the genome.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the members of each of said one or more groups of biallelic markers are located in physical
proximity to one another on said support.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or
sixteenth embodiments, wherein said biallelic markers in each of these groups are located within a genomic region
spanning less than 1kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein said biallelic markers in each of these groups are located within a genomic region spanning from 1
to 5kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 5
to 10kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
10 to 25kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
25 to 50kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
50 to 150kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
150 to 250kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
250 to 500kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
500kb to 1Mb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning more
than 1Mb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 3 biallelic markers.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 6 biallelic markers.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 20 biallelic markers.

A seventeenth embodiment of the present invention is a method for determining whether an individual is at risk
of developing a detectable trait or suffers from a detectable trait associated with said trait comprising the steps of
obtaining a nucleic acid sample from said individual, screening said nucleic acid sample with one or more biallelic markers
obtained by the method of the first embodiment, and determining whether said nucleic acid sample contains one or more
biallelic markers statistically associated with said detectable trait. In one aspect of this seventeenth embodiment, the
detectable trait is selected from the group consisting of disease, drug response, drug efficacy and drug toxicity. In
another aspect of this seventeenth embodiment, the biallelic markers were obtained by the method of the sixth
embodiment. In another aspect of this seventeenth embodiment, the biallelic markers were obtained by the method of
the eighth embodiment.

An eighteenth embodiment of the present invention is a method of using a drug comprising obtaining a nucleic
acid sample from an individual, determining the identity of the polymorphic base of one or more biallelic markers obtained
by the method of the first embodiment which is associated with a positive response to treatment with said drug or one
or more biallelic markers obtained by the method of the first embodiment which is associated with a negative response
to treatment with said drug, and administering said drug to said individual if said nucleic acid sample contains one or
more biallelic markers associated with a positive response to treatment with said drug or if said nucleic acid sample
lacks one or more biallelic markers associated with a negative response to said drug. In one aspect of this eighteenth
embodiment, the determining step comprises determining the identity of the polymorphic base of one or more biallelic
markers obtained by the method of the aspect of the sixth embodiment wherein the trait is drug response which is
associated with a positive response to treatment with said drug or one or more biallelic markers obtained by the aspect
of the sixth embodiment wherein the trait is drug response which is associated with a negative response to treatment
with said drug. In another aspect of this eighteenth embodiment, the determining step comprises determining the
identity of the polymorphic base of one or more biallelic markers obtained by the aspect of the eighth embodiment
wherein the trait is drug response which is associated with a positive response to treatment with said drug or one or
more biallelic markers obtained by the method of the aspect of the eighth embodiment wherein the trait is drug response
which is associated with a negative response to treatment with said drug.
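
The administering condition can be read as a simple decision rule. The sketch below follows the wording literally: the drug is given if the sample carries at least one allele associated with a positive response, or lacks at least one allele associated with a negative response. The representation of alleles as (marker, base) pairs, the marker identifiers and the function name are hypothetical, and a stricter reading (for example, requiring that no negative-response allele be present) is equally easy to encode.

```python
def should_administer(sample_alleles, positive_alleles, negative_alleles):
    """Literal reading of the administering step.

    Each allele is a (marker_id, base) pair; all three arguments are sets.
    """
    carries_positive = any(a in sample_alleles for a in positive_alleles)
    lacks_negative = any(a not in sample_alleles for a in negative_alleles)
    return carries_positive or lacks_negative

# Hypothetical marker identifiers and genotyped alleles.
positive = {("M1", "A")}
negative = {("M3", "T")}
print(should_administer({("M1", "A"), ("M2", "G")}, positive, negative))
print(should_administer({("M3", "T")}, positive, negative))
```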

A nineteenth embodiment of the present invention is a method of selecting an individual for inclusion in a 
clinical trial of a drug comprising obtaining a nucleic acid sample from an individual, determining the identity of the 
polymorphic base of one or more biallelic markers obtained by the method of the first embodiment which is associated
with a positive response to treatment with said drug or one or more biallelic markers associated with a negative 
response to treatment with said drug in said nucleic acid sample, and including said individual in said clinical trial if said 
nucleic acid sample contains one or more biallelic markers obtained by the method of the first embodiment which is 
associated with a positive response to treatment with said drug or if said nucleic acid sample lacks one or more biallelic 
markers associated with a negative response to said drug. In one aspect of this nineteenth embodiment, the determining 
step comprises determining the identity of the polymorphic base of one or more biallelic markers obtained by the aspect 
of the sixth embodiment wherein the trait is drug response which is associated with a positive response to treatment 
with said drug or one or more biallelic markers obtained by the aspect of the sixth embodiment wherein the trait is drug 
response which is associated with a negative response to treatment with said drug. In another aspect of this nineteenth
embodiment the determining step comprises determining the identity of the polymorphic base of one or more biallelic 
markers obtained by the aspect of the eighth embodiment wherein the trait is drug response which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the aspect of the eighth 
embodiment wherein the trait is drug response which is associated with a negative response to treatment with said 
drug. 

A twentieth embodiment of the present invention is a method of identifying a gene associated with a
detectable trait comprising the steps of determining the frequency of each allele of one or more biallelic markers
obtained by the method of the first embodiment in individuals having said detectable trait and individuals lacking said
detectable trait, identifying one or more alleles of one or more biallelic markers having a statistically significant
association with said detectable trait, and identifying a gene in linkage disequilibrium with said one or more alleles.
In one aspect of this twentieth embodiment, the method further comprises identifying a mutation in the gene which is
associated with said detectable trait. In another aspect of this twentieth embodiment, the detectable trait is selected
from the group consisting of disease, drug response, drug efficacy, and drug toxicity.

A twenty-first embodiment of the present invention is a method of identifying a gene associated with a
detectable trait comprising selecting a gene suspected of being associated with a detectable trait and identifying
one or more biallelic markers obtained by the method of the first embodiment within the genomic region harboring said
gene which are associated with said detectable trait. In one aspect of this twenty-first embodiment, the detectable trait
is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity. In another aspect of
this twenty-first embodiment, the identifying step comprises determining the frequencies of said one or more biallelic
markers in individuals who express said detectable trait and individuals who do not express said detectable trait and
identifying one or more biallelic markers which are statistically associated with the expression of said
detectable trait.

A twenty-second embodiment of the present invention is a method of identifying a haplotype associated with
a trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals,
conducting an amplification reaction on said nucleic acid samples using amplification primers capable of
generating amplification products containing the polymorphic bases of a plurality of biallelic markers, contacting one or
more arrays according to the tenth embodiment with said amplification products, determining the identities of the
polymorphic bases of said amplification products, and identifying a haplotype having a statistically significant
association with said trait.

A twenty-third embodiment of the present invention is a method of identifying a haplotype associated with a 
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, conducting 
amplification reactions on said nucleic acid samples using amplification primers capable of generating amplification 
products containing the polymorphic bases of a plurality of biallelic markers, contacting one or more arrays according to 
the fourteenth embodiment with said amplification products, conducting microsequencing reactions on said 
amplification products using microsequencing primers on said arrays, thereby generating elongated microsequencing
primers comprising the polymorphic bases of said amplification products, determining the identities of said polymorphic
bases, and identifying a haplotype having a statistically significant association with said trait.

A twenty-fourth embodiment of the present invention is a method of identifying a haplotype associated with a 
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, conducting
amplification reactions on said nucleic acid samples using amplification primers which are capable of generating
amplification products containing the polymorphic bases of a plurality of biallelic markers, conducting microsequencing
reactions on said nucleic acid samples, thereby generating microsequencing products containing the polymorphic bases
of one or more biallelic markers at their 3' ends, said polymorphic bases being detectably labeled, contacting one or more
arrays according to the sixteenth embodiment with said microsequencing products such that said microsequencing
products specifically hybridize to said nucleic acids complementary to said microsequencing primers, determining
the identities of the polymorphic bases of said microsequencing products, and identifying a haplotype having a
statistically significant association with said trait.

A twenty-fifth embodiment of the present invention is a method of identifying a haplotype associated with a 
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, contacting 
one or more arrays according to the twelfth embodiment with said nucleic acid samples, conducting an amplification
reaction on said nucleic acid samples using amplification primers on said array which are capable of generating
amplification products containing the polymorphic bases of a plurality of biallelic markers, determining the identities of
the polymorphic bases of said amplification products, and identifying a haplotype having a statistically significant
association with said trait.

A twenty-sixth embodiment of the present invention is a method of determining whether an individual is at risk 
of developing Alzheimer's disease or whether the individual suffers from Alzheimer's disease as a result of possessing
the Apo E ε4 Site A allele comprising obtaining a nucleic acid sample from said individual, and determining the identity
of the polymorphic base in one or more of the sequences selected from the group consisting of SEQ ID Nos. 301-305 and
SEQ ID Nos. 307-311 or the sequences complementary thereto in said nucleic acid sample. In one aspect of this
twenty-sixth embodiment, the method further comprises determining whether said nucleic acid sample contains the sequence of
SEQ ID No. 306 or the sequence complementary thereto. In another aspect of this twenty-sixth embodiment, the step of
determining the identity of the polymorphic bases in one or more of the sequences selected from the group consisting of
SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences complementary thereto comprises determining
whether said nucleic acid sample contains the sequence of SEQ ID No. 311 (the T allele of marker 99-365/344) or the
sequence complementary thereto. In another version of the preceding aspect, the method further comprises determining whether
said nucleic acid sample contains the sequence of SEQ ID No. 306 or the sequence complementary thereto.

A twenty-seventh embodiment of the present invention is an isolated nucleic acid comprising a sequence 
selected from the group consisting of SEQ ID No. 301, SEQ ID No. 307, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof. 

A twenty-eighth embodiment of the present invention is an isolated nucleic acid comprising a sequence 
selected from the group consisting of SEQ ID No. 302, SEQ ID No. 308, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides thereof. 

A twenty-ninth embodiment of the present invention is an isolated nucleic acid comprising a sequence selected 
from the group consisting of SEQ ID No. 303, SEQ ID No. 309, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof. 

A thirtieth embodiment of the present invention is an isolated nucleic acid comprising a sequence selected from 
the group consisting of SEQ ID No. 304, SEQ ID No. 310, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof. 

A thirty-first embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID No. 305, SEQ ID No. 311, the sequences complementary thereto, and fragments 
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof. 

A thirty-second embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID Nos. 313-317, SEQ ID Nos. 319-323, and fragments comprising at least 8
consecutive nucleotides thereof.

A thirty-third embodiment of the present invention is an isolated nucleic acid comprising a sequence selected from
the group consisting of SEQ ID Nos. 325-329, SEQ ID Nos. 331-335, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides thereof.

A thirty-fourth embodiment of the present invention is a set of nucleic acids comprising at least 6 consecutive
nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the method of the first
embodiment.

A thirty-fifth embodiment of the present invention is a set of nucleic acids comprising amplification primers for
generating an amplification product comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide,
of one or more biallelic markers obtained by the method of the first embodiment.

A thirty-sixth embodiment of the present invention is a set of nucleic acids comprising one or more
microsequencing primers for determining the identity of the polymorphic base of one or more nucleic acids comprising at
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the 
method of the first embodiment. 

Brief Description of the Drawings 
Figure 1 is a cytogenetic map of chromosome 21. 

Figure 2a shows the results of a computer simulation of the distribution of inter-marker spacing on a randomly 
distributed set of biallelic markers indicating the percentage of biallelic markers which will be spaced a given distance
apart for 1, 2, or 3 markers/BAC in a genomic map (assuming a set of 20,000 minimally overlapping BACs covering the
genome are evaluated). 

Figure 2b shows the results of a computer simulation of the distribution of inter-marker spacing on a randomly 
distributed set of biallelic markers indicating the percentage of biallelic markers which will be spaced a given distance 
apart for 1, 3, or 6 markers/BAC in a genomic map (assuming a set of 20,000 minimally overlapping BACs covering the
genome are evaluated). 

Figure 3 shows, for a series of hypothetical sample sizes, the p-value significance obtained in association
studies performed using individual markers from the high-density biallelic map, according to various hypotheses regarding
the difference of allelic frequencies between the T+ and T- samples.

Figure 4 is a hypothetical association analysis conducted with a map comprising about 3,000 biallelic markers.

Figure 5 is a hypothetical association analysis conducted with a map comprising about 20,000 biallelic
markers.

Figure 6 is a hypothetical association analysis conducted with a map comprising about 60,000 biallelic
markers.

Figure 7 is a haplotype analysis using biallelic markers in the Apo E region. 

Figure 8 is a simulated haplotype analysis using the biallelic markers in the Apo E region included in the
haplotype analysis of Figure 7. 

Figure 9 shows a minimal array of overlapping clones which was chosen for further studies of biallelic markers
associated with prostate cancer, the positions of STS markers known to map in the candidate genomic region along the
contig, and the locations of biallelic markers along the BAC contig spanning a genomic region harboring a candidate gene
associated with prostate cancer which were identified using the methods of the present invention.

Figure 10 is a rough localization of a candidate gene for prostate cancer which was obtained by determining
the frequencies of the biallelic markers of Figure 9 in affected and unaffected populations.

Figure 11 is a further refinement of the localization of the candidate gene for prostate cancer using additional
biallelic markers which were not included in the rough localization illustrated in Figure 10.

Figure 12 is a haplotype analysis using the biallelic markers in the genomic region of the gene associated with
prostate cancer.

Figure 13 is a simulated haplotype using the six markers included in haplotype 5 of Figure 12.

Detailed Description of the Preferred Embodiment
The human haploid genome contains an estimated 80,000 to 100,000 or more genes scattered on a
3 x 10^9 base-long double stranded DNA shared among the 24 chromosomes. Each human being is diploid, i.e. possesses
two haploid genomes, one from paternal origin, the other from maternal origin. The sequence of the human genome
varies among individuals in a population. About 10^7 sites scattered along the 3 x 10^9 base pairs of DNA are polymorphic,
existing in at least two variant forms called alleles. Most of these polymorphic sites are generated by single base
substitution mutations and are biallelic. Less than 10^5 polymorphic sites are due to more complex changes and are very
often multi-allelic, i.e. exist in more than two allelic forms. At a given polymorphic site, any individual (diploid) can be
either homozygous (twice the same allele) or heterozygous (two different alleles). A given polymorphism or rare mutation
can be either neutral (no effect on trait), or functional, i.e. responsible for a particular genetic trait.

Genetic Maps 

The first step towards the identification of genes associated with a detectable trait such as a disease or any 
other detectable trait, consists in the localization of genomic regions containing trait-causing genes using genetic 
mapping methods. The preferred traits contemplated within the present invention relate to fields of therapeutic interest; 
in particular embodiments, they will be disease traits and/or drug response traits, reflecting drug efficacy or toxicity. 
Traits can either be "binary", e.g. diabetic vs. non diabetic, or "quantitative", e.g. elevated blood pressure. Individuals
affected by a quantitative trait can be classified according to an appropriate scale of trait values, e.g. blood pressure 
ranges. Each trait value range can then be analyzed as a binary trait. Patients showing a trait value within one such 
range will be studied in comparison with patients showing a trait value outside of this range. In such a case, genetic 
analysis methods will be applied to subpopulations of individuals showing trait values within defined ranges.

Genetic mapping involves the analysis of the segregation of polymorphic loci in trait 
positive and trait negative populations. Polymorphic loci constitute a small fraction of the human 
genome (less than 1%), compared to the vast majority of human genomic DNA which is identical in 
sequence among the chromosomes of different individuals. Among all existing human polymorphic 
loci, genetic markers can be defined as genome-derived polynucleotides which are sufficiently 
polymorphic to allow a reasonable probability that a randomly selected person will be heterozygous, 
and thus informative for genetic analysis by methods such as linkage analysis or association studies.

A genetic map consists of a collection of polymorphic markers which have been positioned on the human
chromosomes. Genetic maps may be combined with physical maps, collections of ordered overlapping fragments of
genomic DNA whose arrangement along the human chromosomes is known. The optimal genetic map should possess
the following characteristics: 

- the density of the genetic markers scattered along the genome should be sufficient to allow the identification and
localization of any trait-related polymorphism,

- each marker should have an adequate level of heterozygosity, so as to be informative in a large percentage of different
meioses,

- all markers should be easily typed on a routine basis, at a reasonable expense, and in a reasonable amount of time,

- the entire set of markers per chromosome should be ordered in a highly reliable fashion.

However, while the above maps are optimal, it will be appreciated that the maps of the present invention may 
be used in the individual marker and haplotype association analyses described below without the necessity of
determining the order of biallelic markers derived from a single BAC with respect to one another. 

Genetic Maps Based on RFLPs or VNTRs

The analysis of DNA polymorphisms has relied on the following types of polymorphisms. The first generation
of genetic markers were restriction fragment length polymorphisms (RFLPs), single nucleotide polymorphisms which
occur at restriction sites, thereby modifying the cleavage pattern of the corresponding restriction enzyme. Though the
original methods used to type RFLPs were material-, effort- and time-consuming, today these markers can easily be
typed by PCR-based technologies. Since they are biallelic markers (they present only two alleles, the restriction site
being either present or absent), their maximum heterozygosity is 0.5. The theoretical number of RFLPs distributed along
the entire human genome is more than 10^5, which leads to a potential average inter-marker distance of 30 kilobases.
However, in reality the number of evenly distributed RFLPs which occur at a sufficient frequency in the population to
make them useful for tracking of genetic polymorphisms is very limited.

The second generation of genetic markers was VNTRs (Variable Number of Tandem Repeats), which can be
categorized as either minisatellites or microsatellites. Minisatellites are tandemly repeated DNA sequences present in
units of 5-50 repeats which are distributed along regions of the human chromosomes ranging from 0.1 to 20 kilobases in
length. Since they present many possible alleles, their polymorphic informative content is very high. Minisatellites are
scored by performing Southern blots to identify the number of tandem repeats present in a nucleic acid sample from the
individual being tested. However, there are only 10^4 potential VNTRs that can be typed by Southern blotting.

Microsatellites (also called simple tandem repeat polymorphisms, or simple sequence length polymorphisms) 
constitute the most developed category of genetic markers. They include small arrays of tandem repeats of simple 
sequences (di-, tri-, tetra-nucleotide repeats) which exhibit a high degree of length polymorphism and thus a high level of
informativeness. Slightly more than 5,000 microsatellites, easily typed by PCR-derived technologies, have been ordered
along the human genome (Dib et al., Nature 380:152 (1996), the disclosure of which is incorporated herein by
reference).

A number of these available microsatellites were used to construct integrated physical and genetic maps
containing less than 5,000 markers. For example, CEPH (Chumakov et al., Nature 377:175-298 (1995) and Cohen et al.,
Nature 366:698-701 (1993), the disclosures of which are incorporated herein by reference), and Whitehead Institute
and Généthon (Hudson et al., 1995), constructed genetic and physical maps covering 75% to 95% of the human genome,
based on 2500 to 5000 microsatellite markers.

However, the number of easily typed informative markers in these maps was too small for the average
distance between informative markers to fulfill the above-listed requirements for genetic maps.

Biallelic Markers 

Biallelic markers are genome-derived polynucleotides which exhibit biallelic polymorphism. As used herein, the
term biallelic marker means a biallelic single nucleotide polymorphism. As used herein, the term polymorphism may
include a single base substitution, insertion, or deletion. By definition, the lowest allele frequency of a biallelic
polymorphism is 1% (sequence variants which show allele frequencies below 1% are called rare mutations). There are
potentially more than 10^7 biallelic markers which can easily be typed by routine automated techniques, such as
sequence- or hybridization-based techniques, out of which 10^6 are sufficiently informative for mapping purposes.
However, a biallelic marker will show a sufficient degree of informativeness for use in genetic mapping only if the
frequency of its less frequent allele is not less than about 10% (i.e. a heterozygosity rate of at least 0.18) (the
heterozygosity rate for a biallelic marker is 2 P_a (1 - P_a), where P_a is the frequency of allele a). Preferably, the frequency
of the less frequent allele of the biallelic markers in the present maps is at least 20% (i.e. a heterozygosity rate of at
least 0.32). More preferably, the frequency of the less frequent allele of the biallelic markers in the present maps is at
least 30% (i.e. its heterozygosity rate is higher than about 0.42).
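
The heterozygosity thresholds quoted above follow directly from the expression 2 P_a (1 - P_a). The short sketch below simply evaluates that expression for the allele frequencies mentioned in the text; the function name heterozygosity is an illustrative choice.

```python
def heterozygosity(minor_allele_freq):
    """Expected heterozygosity 2 * p * (1 - p) of a biallelic marker,
    where p is the frequency of the less frequent allele."""
    p = minor_allele_freq
    if not 0.0 <= p <= 0.5:
        raise ValueError("the less frequent allele has a frequency between 0 and 0.5")
    return 2.0 * p * (1.0 - p)

# Frequencies discussed in the text, plus the maximum (p = 0.5).
for p in (0.10, 0.20, 0.30, 0.50):
    print(f"p = {p:.2f} -> heterozygosity = {heterozygosity(p):.2f}")
```

The maximum value of 0.5 is reached when both alleles are equally frequent, which is the same upper bound stated for RFLPs.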

Initial attempts to construct genetic maps based on non-RFLP biallelic markers have focused on identifying 
biallelic markers lying within sequence tagged sites (STSs), pieces of genomic DNA having a known sequence and
averaging about 250 bases in length. More than 30,000 STSs have been identified and ordered along the genome
(Hudson et al., Science 270:1945-1954 (1995); Schuler et al., Science 274:540-546 (1996), the disclosures of which
are incorporated herein by reference). For example, the Whitehead Institute and Généthon's integrated map contains
15,086 STSs.

These sequence tagged sites can be screened to identify polymorphisms, preferably Single Nucleotide 
Polymorphisms (SNPs), more preferably non-RFLP biallelic markers therein. Generally polymorphisms are identified by
determining the sequence of the STSs in 5 to 10 individuals.

Wang et al. (Cold Spring Harbor Laboratory: Abstracts of papers presented on Genome Mapping and
Sequencing, p. 17 (May 14-18, 1997), the disclosure of which is incorporated herein by reference) recently announced the
identification and mapping of 750 Single Nucleotide Polymorphisms issued from the sequencing of 12,000 STSs from
the Whitehead/MIT map, in eight unrelated individuals. The map was assembled using a high throughput system based
on the utilization of DNA chip technology available from Affymetrix (Chee et al., Science 274:610-614 (1996), the
disclosure of which is incorporated herein by reference).

However, according to experimental data and statistical calculations, less than one out of 10 of all STSs
mapped today will contain an informative Single Nucleotide Polymorphism. This is primarily due to the short length of
existing STSs (usually less than 250 bp). If one assumes 10^6 informative SNPs spread along the human genome, there
would on average be one marker of interest every 3 x 10^9 / 10^6, i.e. every 3,000 bp. The probability that one such marker
is present on a 250 bp stretch is thus less than 1/10.
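
The arithmetic behind this estimate can be checked with a few lines of code. The sketch below uses the figures given in the text (a 3 x 10^9 bp genome, 10^6 informative SNPs, a 250 bp STS); the additional assumption that informative SNPs are placed independently and uniformly, so that their count in a stretch is approximately Poisson distributed, is made here for illustration only.

```python
import math

GENOME_BP = 3_000_000_000      # haploid genome size used in the text
INFORMATIVE_SNPS = 1_000_000   # assumed number of informative SNPs
STS_LENGTH_BP = 250            # typical upper length of an existing STS

# Average distance between informative markers.
spacing_bp = GENOME_BP / INFORMATIVE_SNPS

# Expected number of informative markers falling within one STS.
expected_markers = STS_LENGTH_BP / spacing_bp

# Probability of at least one marker in the STS under a Poisson model
# (an assumption of this sketch, not stated in the text).
p_at_least_one = 1.0 - math.exp(-expected_markers)

print(f"average spacing: {spacing_bp:.0f} bp")
print(f"expected markers per STS: {expected_markers:.3f}")
print(f"P(at least one marker): {p_at_least_one:.3f}")
```

Both the expected count and the Poisson probability come out below 1/10, in line with the statement above.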

While it could produce a high density map, the STS approach based on currently existing markers does not put
any systematic effort into making sure that the markers obtained are optimally distributed throughout the entire 
genome. Instead, polymorphisms are limited to those locations for which STSs are available. 

The even distribution of markers along the chromosomes is critical to the future success of genetic analyses. 
In particular, a high density map having appropriately spaced markers is essential for conducting association studies on 
sporadic cases, aiming at identifying genes responsible for detectable traits such as those which are described below. 

As will be further explained below, genetic studies have mostly relied in the past on a statistical approach
called linkage analysis, which took advantage of microsatellite markers to study their inheritance pattern within families
from which a sufficient number of individuals presented the studied trait. Because of intrinsic limitations of linkage
analysis, which will be further detailed below, and because these studies necessitate the recruitment of adequate family
pedigrees, they are not well suited to the genetic analysis of all traits, particularly those for which only sporadic cases
are available (e.g. drug response traits), or those which have a low penetrance within the studied population.

Association studies offer an alternative to linkage analysis. Combined with the use of a high density map of 
appropriately spaced, sufficiently informative markers, association studies, including linkage disequilibrium-based 
genome wide association studies, will enable the identification of most genes involved in complex traits.

The present invention relates to a method for generating a high density linkage disequilibrium-based genetic 
map of the human genome which will allow the identification of sufficiently informative markers spaced at intervals 
which permit their use in identifying genes responsible for detectable traits using genome-wide association studies and 
linkage disequilibrium mapping. 

Construction of a Physical Map 
The first step in constructing a high density genetic map of biallelic markers is the construction of a physical
map. Physical maps consist of ordered, overlapping cloned fragments of genomic DNA covering a portion of the genome,
preferably covering one or all chromosomes. Obtaining a physical map of the genome entails constructing and ordering a
genomic DNA library.

Physical mapping in complex genomes such as the human genome (3,000 Megabases) requires the construction
of DNA libraries containing large inserts (on the order of 0.1 to 1 Megabase). It is crucial that such libraries be easy to 
construct, screen and manipulate, and that the DNA inserts be stable and relatively free of chimerism. 

Yeast artificial chromosomes (YACs; Burke et al., Science 236:806-812 (1987), the disclosure of which is
incorporated herein by reference) have provided an invaluable tool in the analysis of complex genomes since their cloning
capacity is extremely high (in the Mb range). YAC libraries containing large DNA inserts (up to 2 Mb) have been used to
generate STS-content maps of individual chromosomes or of the entire human genome (Chumakov et al. (1995), supra;
Hudson et al. (1995), supra; Cohen et al., Nature 366:698-701 (1993); Chumakov et al., Nature 359:380-387 (1992);
Gemmill et al., Nature 377:299-319 (1995); Doggett et al., Nature 377:335-365 (1995); the disclosures of which are
incorporated herein by reference).

The present genetic maps may be constructed using currently available YAC genomic libraries such as the
CEPH human YAC library as a starting material (Chumakov et al. (1995), supra). Alternatively, one may construct a
YAC genomic library as described in Chumakov et al., 1995, the disclosure of which is incorporated herein by reference,
or as described below.

Once a YAC genomic library has been obtained the genomic DNA fragments therein are ordered. Ordering may 
be performed directly on the genomic DNA in the YAC library. However, direct ordering of YAC inserts is not preferred
because YAC libraries often exhibit a high rate of chimerism (40 to 50% of YAC clones contain fragments from more 
than one genomic region), often suffer from clonal instability within their genomic DNA inserts, and require tedious 
procedures to manipulate and isolate the insert DNA. Instead, it is preferable to conduct the mapping and sequencing 
procedures required for ordering the genomic DNA in a system which enables the stable cloning of large inserts while 
being easy to manipulate using standard molecular biology techniques. 

Accordingly, it is preferable to clone the genomic DNA into bacterial single copy plasmids, for example BACs
(Bacterial Artificial Chromosomes), rather than into YACs. Bacterial artificial chromosomes are well suited for use in
ordering genomic DNA fragments. BACs provide a low rate of chimerism and fragment rearrangement together with
relative ease of insert isolation. Thus BAC libraries are well suited to integrate genetic, STS and cytogenetic
information while providing direct access to stable, readily-sequenceable genomic DNA. An example of a bacterial artificial
chromosome is the BAC cloning system of Shizuya et al., which is capable of stably propagating and maintaining
relatively large genomic DNA fragments (up to 300 kb long) as single-copy plasmids in E. coli (Shizuya et al., Proc. Natl.
Acad. Sci. USA 89:8794-8797 (1992), the disclosure of which is incorporated herein by reference).

Example 1 describes the construction of a BAC library containing human genomic DNA. It will be appreciated 
that the source of the genomic DNA, the enzymes used to digest the DNA, the vectors into which the genomic DNA is
inserted, and the size of the DNA inserts which are cloned into said vectors need not be identical to those described in
Example 1 below. Rather, the genomic DNA may be obtained from any appropriate source, may be digested with any
appropriate enzyme, and may be cloned into any suitable vector. Insert size may vary within any range compatible with
the cloning system chosen and with the intended purpose of the library being constructed. Typically, using BAC vectors
to construct DNA libraries covering the entire human genome, insert size may vary between 50 kb and 300 kb, preferably
100 kb and 200 kb.

Example 1 
Construction of a BAC library 
Three different human genomic DNA libraries were produced by cloning partially digested DNA from a human 
lymphoblastoid cell line (derived from individual N° 8445, CEPH families) into the pBeloBAC11 vector (Kim et al.,
Genomics 34:213-218 (1996), the disclosure of which is incorporated herein by reference). One library was produced
using a BamHI partial digestion of the genomic DNA from the lymphoblastoid cell line and contains 110,000 clones
having an average insert size of 150 kb (corresponding to 5 human haploid genome equivalents). Another library was
prepared from a HindIII partial digest and corresponds to 3 human genome equivalents with an average insert size of
150 kb. A third library was prepared from a NdeI partial digest and corresponds to 4 human genome equivalents with an
average insert size of 150 kb.
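
The relationship between clone number, insert size and genome equivalents used in this example can be sketched as follows. The function names are illustrative, the haploid genome size is the 3 x 10^9 bp figure used elsewhere in this description, and the clone numbers computed for the HindIII and NdeI libraries are derived values, not figures reported above.

```python
HAPLOID_GENOME_KB = 3_000_000  # 3 x 10^9 bp expressed in kb

def genome_equivalents(n_clones, mean_insert_kb):
    """Redundancy of a library: total cloned DNA divided by genome size."""
    return n_clones * mean_insert_kb / HAPLOID_GENOME_KB

def clones_for_coverage(equivalents, mean_insert_kb):
    """Number of clones needed to reach a given number of genome equivalents."""
    return round(equivalents * HAPLOID_GENOME_KB / mean_insert_kb)

# BamHI library: 110,000 clones of about 150 kb, i.e. roughly 5 equivalents.
print(genome_equivalents(110_000, 150))

# Derived (not reported) clone numbers for 3 and 4 genome equivalents at 150 kb.
print(clones_for_coverage(3, 150), clones_for_coverage(4, 150))
```

With these inputs the BamHI library works out to about 5.5 genome equivalents, consistent with the approximate value of 5 stated in the example.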

Alternatively, the genomic DNA may be inserted into BAC vectors which possess both a high copy number 
origin of replication, which facilitates the isolation of the vector DNA, and a low copy number origin of replication. 
Cloning of a genomic DNA insert into the high copy number origin of replication inactivates the origin such that clones
containing a genomic insert replicate at low copy number. The low copy number of clones having a genomic insert 
therein permits the inserts to be stably maintained. In addition, selection procedures may be designed which enable low 
copy number plasmids (i.e. vectors having genomic inserts therein) to be selected. Such vectors and selection procedures 
are described in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector" (GENSET.015A, Serial
No. 09/058,746), the disclosure of which is incorporated herein by reference. 

It will be appreciated that the present methods may be practiced using BAC vectors other than those of 
Shizuya et al. (1992, supra), or derived from those, or vectors other than BAC vectors which possess the above-
described characteristics. 

To construct a physical map of the genome from genomic DNA libraries, the library clones have to be ordered
along the human chromosomes. In a preferred embodiment, a minimal subset of the ordered clones will then be chosen
that completely covers the entire genome.
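
Once clones have been ordered, choosing a minimal covering subset can be approached as an interval covering problem. The following is a minimal sketch assuming that each ordered clone has already been assigned start and end coordinates along the region of interest; the coordinates, clone names and the greedy strategy shown are illustrative assumptions and are not prescribed by the present description.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy choice of the fewest clones covering [region_start, region_end).

    clones is a list of (name, start, end) tuples with hypothetical
    coordinates. Raises ValueError if the clones leave a gap.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best)
        covered_to = best[2]
    return path

# Hypothetical overlapping clones (coordinates in kb).
clones = [("A", 0, 150), ("B", 50, 180), ("C", 100, 300),
          ("D", 170, 320), ("E", 290, 450)]
print([name for name, _, _ in minimal_tiling_path(clones, 0, 450)])
```

In this toy example three of the five clones suffice to span the region; in practice clone positions would be inferred from STS content rather than known exactly.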

For example, the genomic DNA in the inserts of the above described BAC vectors is ordered using STS markers whose
positions relative to one another and locations along the genome are known using procedures such as those described 
herein. The STS markers used to order the BAC inserts may be the STS markers contained in the integrated maps 
described above. Alternatively, the STSs may be STSs which are not contained in any of the physical maps described 
above. In another embodiment the STSs may be a combination of STSs included in the physical maps described above 
and STSs which are not included in the integrated maps described above. 

The BAC vectors are screened with STSs until there is at least one positive BAC clone per STS. Preferably, a
minimally overlapping set of 10,000 to 30,000 BACs having genomic inserts spanning the entire human genome is
identified. More preferably, a minimally overlapping set of 10,000 to 30,000 BACs having genomic inserts of about
100-300 kb in length spanning the entire human genome is identified. In a preferred embodiment, a minimally overlapping set
of 10,000 to 30,000 BACs having genomic inserts of about 100-150 kb in length spanning the entire human genome is
identified. In a highly preferred embodiment, a minimally overlapping set of 15,000 to 25,000 BACs having genomic
inserts of about 100-200 kb in length spanning the entire human genome is identified. Alternatively, a smaller number of
BACs spanning a set of chromosomes, a single chromosome, a particular subchromosomal region, or any other desired
portion of the genome may be ordered. The BACs may be screened for the presence of STSs as described in Example 2
below.

Example 2

Ordering of a BAC Library: Screening Clones with STSs

The BAC library is screened with a set of PCR-typeable STSs to identify clones containing the STSs. To
facilitate PCR screening of several thousand clones, for example 200,000 clones, pools of clones are prepared.

Three-dimensional pools of the BAC libraries ate prepared as described in Chumakov et a!, and are screened for 
the ability to generate an amplification fragment in amplification reactions conducted using primers derived from the 
ordered STSs. (Chumakov et aL (1 995). supr*)- A BAC library typically contains 200,000 BAC clones. Since the average 
size of each insert is 100-300 kb, the overall size of such a library is equivalent *to the size of at least about 7 human 
genomes. This library is stored as an array of individual clones in 518 384-wc« plates. It can be divided into 74 primary 
pools (7 plates eachh Each primary pool can then be divided into 48 subpools prepared by using a three-dimensional 
pooling system based on the plate, row and column address of each clone (more particularly, 7 subpools consisting of all 
clones residing in a given microtiter plate; 16 subpools consisting of all clones in a given row; 24 subpools consisting of 
all clones in a given column). 
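
The library figures quoted above can be checked with simple arithmetic. The following is a minimal sketch, not part of the described procedure: the genome size of 3 x 10^6 kb is the round value quoted later in this document, and the function names are hypothetical.

```python
# Figures taken from the text; GENOME_KB is an assumed round value.
CLONES = 200_000
GENOME_KB = 3_000_000
PLATES = 518
PLATES_PER_PRIMARY_POOL = 7


def genome_equivalents(n_clones, insert_kb, genome_kb=GENOME_KB):
    """Total cloned DNA expressed as a multiple of the genome size."""
    return n_clones * insert_kb / genome_kb


def primary_pool_count(n_plates, plates_per_pool=PLATES_PER_PRIMARY_POOL):
    """Number of primary pools when plates are grouped without remainder."""
    if n_plates % plates_per_pool:
        raise ValueError("plates do not divide evenly into primary pools")
    return n_plates // plates_per_pool


# Lower bound of the 100-300 kb insert range.
print(round(genome_equivalents(CLONES, 100), 1))
print(primary_pool_count(PLATES))
```

With 100 kb inserts the library holds a little under 7 genome equivalents, in line with "at least about 7", and 518 plates grouped by 7 give the 74 primary pools stated above.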

Amplification reactions are conducted on the pooled BAC clones using primers specific for the STSs. For
example, the three dimensional pools may be screened with 45,000 STSs whose positions relative to one another and 
locations along the genome are known. Preferably, the three dimensional pools are screened with about 30,000 STSs 
whose positions relative to one another and locations along the genome are known. In a highly preferred embodiment, 
the three dimensional pools are screened with about 20,000 STSs whose positions relative to one another and locations 
along the genome are known. 

Amplification products resulting from the amplification reactions are detected by conventional agarose gel 
electrophoresis combined with automatic image capturing and processing. PCR screening for an STS involves three
steps: (1) identifying the positive primary pools; (2) for each positive primary pool, identifying the positive plate, row and 
column 'subpools' to obtain the address of the positive clone; (3) directly confirming the PCR assay on the identified 
clone. PCR assays are performed with primers specifically defining the STS. 
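
Step (2) of the screen amounts to combining the positive plate, row and column subpools into a clone address. The sketch below illustrates that bookkeeping only; the function name and the data layout (sets of positive subpool labels within one primary pool) are assumptions made for illustration.

```python
def clone_addresses(positive_plates, positive_rows, positive_cols):
    """Return candidate (plate, row, column) addresses in one primary pool.

    Each argument is the set of positive subpools in that dimension.
    One positive per dimension gives a single address. Several positives
    give every combination, and each candidate still needs the direct
    confirmation PCR of step (3).
    """
    return [
        (plate, row, col)
        for plate in sorted(positive_plates)
        for row in sorted(positive_rows)
        for col in sorted(positive_cols)
    ]


# One positive subpool per dimension: a single unambiguous address.
print(clone_addresses({3}, {"C"}, {17}))
# Two positive plates and two positive columns: four candidates to confirm.
print(clone_addresses({1, 2}, {"A"}, {5, 6}))
```

When more than one clone in a primary pool carries the STS, the three dimensions alone cannot tell true addresses from false combinations, which is why the confirmation assay on the individual clone is needed.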

Screening is conducted as follows. First, BAC DNA containing the genomic inserts is prepared as follows.
Bacteria containing the BACs are grown overnight at 37°C in 120 µl of LB containing chloramphenicol (12 µg/ml). DNA
is extracted by the following protocol:

Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and resuspend pellet in 120 µl TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM)
Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and incubate pellet with 20 µl lysozyme 1 mg/ml for 15 min at room temperature
Add 20 µl proteinase K 100 µg/ml and incubate 15 min at 60°C
Add 8 µl DNAse 2 U/µl and incubate 1 hr at room temperature
Add 100 µl TE 10-2 and keep at -80°C

PCR assays are performed using the following protocol:

Final volume                                            15 µl
BAC DNA                                                 1.7 ng/µl
MgCl2                                                   2 mM
dNTP (each)                                             200 µM
primer (each)                                           2.9 ng/µl
Ampli Taq Gold DNA polymerase                           0.05 unit/µl
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl)     1x

The amplification is performed on a Genius II thermocycler. After heating at 95°C for 10 min, 40 cycles are performed.
Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at 72°C ends
the amplification. PCR products are analyzed on 1% agarose gel with 0.1 mg/ml ethidium bromide.
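
The per-microliter concentrations in the protocol translate into per-reaction amounts once multiplied by the 15 µl final volume. The sketch below shows that conversion; the dictionary keys and function names are hypothetical, and only components whose concentrations are given per microliter (or as a fold-concentrated stock) are included.

```python
FINAL_VOLUME_UL = 15.0

# Final concentrations per microliter of reaction, as listed in the protocol.
PER_UL = {
    "BAC DNA (ng)": 1.7,
    "each primer (ng)": 2.9,
    "Ampli Taq Gold (units)": 0.05,
}


def per_reaction_amounts(final_volume_ul=FINAL_VOLUME_UL, per_ul=PER_UL):
    """Amount of each component in one reaction of the given volume."""
    return {name: round(conc * final_volume_ul, 3) for name, conc in per_ul.items()}


def stock_volume_ul(final_volume_ul, stock_fold, final_fold=1.0):
    """Volume of a fold-concentrated stock needed to reach final_fold."""
    return final_volume_ul * final_fold / stock_fold


print(per_reaction_amounts())
print(stock_volume_ul(FINAL_VOLUME_UL, stock_fold=10.0))  # 10x buffer to 1x
```

For a 15 µl reaction this gives 25.5 ng of BAC DNA, 43.5 ng of each primer, 0.75 unit of polymerase and 1.5 µl of 10x buffer.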

Alternatively, a YAC (Yeast Artificial Chromosome) library can be used. The very large insert size, of the order 
of 1 megabase, is the main advantage of the YAC libraries. The library can typically include about 33,000 YAC clones as 
described in Chumakov et al. (1995, supra). The YAC screening protocol may be the same as the one used for BAC
screening. 

The known order of the STSs is then used to align the BAC inserts in an ordered array (contig) spanning the
whole human genome. If necessary, new STSs to be tested can be generated by sequencing the ends of selected BAC
inserts. Subchromosomal localization of the BACs can be established and/or verified by fluorescence in situ hybridization
(FISH), performed on metaphasic chromosomes as described by Cherif et al. 1990 and in Example 8 below. BAC insert
size may be determined by Pulsed Field Gel Electrophoresis after digestion with the restriction enzyme NotI.

Finally, a minimally overlapping set of BAC clones, with known insert size and subchromosomal location, 
covering the entire genome, a set of chromosomes, a single chromosome, a particular subchromosomal region, or any 
other desired portion of the genome is selected from the DNA library. For example, the BAC clones may cover at least
100 kb of contiguous genomic DNA, at least 250 kb of contiguous genomic DNA, at least 500 kb of contiguous genomic
DNA, at least 2 Mb of contiguous genomic DNA, at least 5 Mb of contiguous genomic DNA, at least 10 Mb of contiguous
genomic DNA, or at least 20 Mb of contiguous genomic DNA.

Identification of biallelic markers
In order to generate polymorphisms having the adequate informative content to be used as biallelic markers for
genetic mapping, the sequences of random genomic fragments from an appropriate number of unrelated individuals are
compared. Genomic sequences to be screened for biallelic markers may be generated by partially sequencing BAC
inserts, preferably by sequencing the ends of BAC subclones. Sequencing the ends of an adequate number of BAC
subclones derived from a minimally overlapping array of BACs such as those described above will allow the generation of
biallelic markers spanning the entire genome, a set of chromosomes, a single chromosome, a particular subchromosomal
region, or any other desired portion of the genome with an optimized inter-marker spacing.

Thus, portions of the BACs in the selected ordered array are then subcloned and sequenced using, for example,
the procedures described below.

Example 3 
Subcloning of BACs

The cells obtained from three liters of overnight culture of each BAC clone are treated by alkaline lysis using
conventional techniques to obtain the BAC DNA containing the genomic DNA inserts. After centrifugation of the BAC
DNA in a cesium chloride gradient, ca. 50 µg of BAC DNA are purified. 5-10 µg of BAC DNA are sonicated using three
distinct conditions, to obtain fragments within a desired size range. The obtained DNA fragments are end-repaired in a
50 µl volume with two units of Vent polymerase for 20 min at 70°C, in the presence of the four deoxytriphosphates
(100 µM). The resulting blunt-ended fragments are separated by electrophoresis on preparative low-melting point 1%
agarose gels (60 Volts for 3 hours). The fragments lying within a desired size range, such as 600 to 6,000 bp, are
excised from the gel and treated with agarase. After chloroform extraction and dialysis on Microcon 100 columns, DNA
in solution is adjusted to a 100 ng/µl concentration. A ligation to a linearized, dephosphorylated, blunt-ended plasmid
cloning vector is performed overnight by adding 100 ng of BAC fragmented DNA to 20 ng of pBluescript II SK(+) vector
DNA linearized by enzymatic digestion and treated with alkaline phosphatase. The ligation reaction is performed in a
10 µl final volume in the presence of 40 units/µl T4 DNA ligase (Epicentre). The ligated products are electroporated into
the appropriate cells (ElectroMAX E. coli DH10B cells). IPTG and X-gal are added to the cell mixture, which is then
spread on the surface of an ampicillin-containing agar plate. After overnight incubation at 37°C, recombinant (white)
colonies are randomly picked and arrayed in 96-well microplates for storage and sequencing.

Alternatively, BAC subcloning may be performed using vectors which possess both a high copy number origin
of replication, which facilitates the isolation of the vector DNA, and a low copy number origin of replication. Cloning of
a genomic DNA fragment into the high copy number origin of replication inactivates the origin such that clones
containing a genomic insert replicate at low copy number. The low copy number of clones having a genomic insert
therein permits the inserts to be stably maintained. In addition, selection procedures may be designed which enable low
copy number plasmids (i.e. vectors having genomic inserts therein) to be selected. In a preferred embodiment, BAC
subcloning will be performed in vectors having the above described features and moreover enabling high throughput
sequencing of long fragments of genomic DNA. Such high throughput high quality sequencing may be obtained after
generating successive deletions within the subcloned fragments to be sequenced, using transposition-based or enzymatic
systems. Such vectors are described in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector"
(GENSET.015A, Serial No. 09/058,746), the disclosure of which is incorporated herein by reference.

It will be appreciated that other subcloning methods familiar to those skilled in the art may also be employed.

The resulting subclones are then partially sequenced using, for example, the procedures described below.

Partial sequencing of BAC subclones
The genomic DNA inserts in the subclones, such as the BAC subclones prepared above, are amplified by 
conducting PCR reactions on the overnight bacterial cultures, using primers complementary to vector sequences flanking
the insertions. 

The sequences of the insert extremities (on average 500 bases at each end, obtained under routine sequencing
conditions) are determined by fluorescent automated sequencing on ABI 377 sequencers, using ABI Prism DNA
Sequencing Analysis software. Following gel image analysis and DNA sequence extraction, sequence data are
automatically processed with adequate software to assess sequence quality. A proprietary base-caller automatically
flags suspect peaks, taking into account the shape of the peaks, the inter-peak resolution, and the noise level. The
proprietary base-caller also performs an automatic trimming. Any stretch of 25 or fewer bases having more than 4 suspect
peaks is usually considered unreliable and is discarded. 
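
The trimming rule can be read as a sliding-window filter over per-base "suspect" flags. The sketch below is one possible reading only: the base-caller itself is proprietary and is not described here, so the window handling, the function name and the choice to flag the whole 25-base window are all assumptions.

```python
def unreliable_mask(suspect, window=25, max_suspect=4):
    """Flag bases lying in any window with more than max_suspect suspect peaks.

    suspect is a list of booleans, one per called base. Returns a list of
    booleans of the same length, True where the base would be discarded.
    """
    n = len(suspect)
    mask = [False] * n
    # For reads shorter than the window, a single window covers the read.
    for start in range(0, max(n - window, 0) + 1):
        end = min(start + window, n)
        if sum(suspect[start:end]) > max_suspect:
            for i in range(start, end):
                mask[i] = True
    return mask


# Five suspect peaks at the start of a 45-base read, clean afterwards.
flags = [True] * 5 + [False] * 40
mask = unreliable_mask(flags)
print(sum(mask))  # number of bases flagged for removal
```

In this example only the first 25-base window contains more than 4 suspect peaks, so the first 25 bases are flagged and the remainder of the read is kept.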

The sequenced regions of the subclones, such as the BAC subclones prepared above, are then analyzed in 
order to identify biallelic markers lying therein. The frequency at which biallelic markers will be detected in the
screening process varies with the average level of heterozygosity desired. For example, if biallelic markers having an
average heterozygosity rate of greater than 0.42 are desired, they will occur every 2.5 to 3 kb on average. Therefore,
on average, six 500 bp-genomic fragments have to be screened in order to derive 1 biallelic marker having an adequate 
informative content. 

As a preferred alternative to sequencing the ends of an adequate number of BAC subclones, the above
mentioned high throughput deletion-based sequencing vectors, which allow the generation of a high quality sequence
information covering fragments of ca. 6 kb, may be used. Having sequence fragments longer than 2.5 or 3 kb enhances
the chances of identifying biallelic markers therein. Methods of constructing and sequencing a nested set of deletions
are disclosed in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector" (GENSET.015A, Serial
No. 09/058,746), the disclosure of which is incorporated herein by reference.

WO 99/04038

PCT/IB98/01193

To identify biallelic markers using partial sequence information derived from subclone ends,
such as the ends of the BAC subclones prepared above, pairs of primers, each one specifically
defining a 500 bp amplification fragment, are designed using the above mentioned partial sequences.
The primers used for the genomic amplification of fragments derived from the subclones, such as
the BAC subclones prepared above, may be designed using the OSP software (Hillier L. and Green
P., Methods Appl. 1:124-8 (1991), the disclosure of which is incorporated herein by reference). The
GC content of the amplification primers preferably ranges between 10 and 75 %, more preferably
between 35 and 60 %, and most preferably between 40 and 55 %. The length of amplification
primers can range from 10 to 100 nucleotides, preferably from 10 to 50, 10 to 30 or more preferably
10 to 20 nucleotides. Shorter primers tend to lack specificity for a target nucleic acid sequence and
generally require cooler temperatures to form sufficiently stable hybrid complexes with the
template. Longer primers are expensive to produce and can sometimes self-hybridize to form hairpin
structures.
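
The GC-content and length preferences stated above are easy to apply as a filter on candidate primers. The following is a minimal sketch of such a filter, not the OSP software; the function names are hypothetical, and the default ranges are the "most preferably" GC range and the narrowest length range from the text.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("empty primer")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def primer_ok(primer, gc_range=(40.0, 55.0), length_range=(10, 20)):
    """True if the primer falls inside both the GC and the length range."""
    gc = gc_percent(primer)
    return (gc_range[0] <= gc <= gc_range[1]
            and length_range[0] <= len(primer) <= length_range[1])


print(gc_percent("ATGCATGCATGCATGCAT"))
print(primer_ok("ATGCATGCATGCATGCAT"))
print(primer_ok("ATATATATAT"))
```

Passing the wider ranges from the text (for example `gc_range=(10.0, 75.0)` and `length_range=(10, 100)`) relaxes the filter accordingly.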

All primers may contain, upstream of the specific target bases, a common oligonucleotide tail that serves as a 
sequencing primer. Those skilled in the art are familiar with primer extensions which can be used for these purposes.

To identify biallelic markers, the sequences corresponding to the partial sequences determined above are 
determined and compared in a plurality of individuals. The population used to identify biallelic markers having an 
adequate informative content preferably consists of ca. 100 unrelated individuals from a heterogeneous population. 

First, DNA is extracted from the peripheral venous blood of each donor using methods such as those described
in Example 5.

Example 5
Extraction of DNA

30 ml of blood are taken from the individuals in the presence of EDTA. Cells (pellet) are collected after
centrifugation for 10 minutes at 2000 rpm. Red cells are lysed by a lysis solution (50 ml final volume: 10 mM Tris
pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution is centrifuged (10 minutes, 2000 rpm) as many times as necessary to
eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis solution.

The pellet of white cells is lysed overnight at 42°C with 3.7 ml of lysis solution composed of:
• 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
• 200 µl SDS 10%
• 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M).

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) is added. After vigorous agitation, the
solution is centrifuged for 20 minutes at 10000 rpm.

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are added to the previous supernatant, and the solution is
centrifuged for 30 minutes at 2000 rpm. The DNA solution is rinsed three times with 70% ethanol to eliminate salts,
and centrifuged for 20 minutes at 2000 rpm. The pellet is dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 ml
water. The DNA concentration is evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA).

To evaluate the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio is determined. Only DNA
preparations having an OD 260 / OD 280 ratio between 1.8 and 2 are used in the subsequent steps described below.
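
The two spectrophotometric checks can be expressed directly. The sketch below uses the conversion factor and the acceptance window given in the text; the dilution factor parameter and the function names are assumptions added for illustration.

```python
def dna_conc_ug_per_ml(od260, dilution=1.0):
    """DNA concentration from OD at 260 nm (1 OD unit = 50 ug/ml DNA)."""
    return od260 * 50.0 * dilution


def purity_ok(od260, od280, low=1.8, high=2.0):
    """True if the OD 260 / OD 280 ratio lies in the accepted window."""
    if od280 <= 0:
        raise ValueError("OD 280 must be positive")
    return low <= od260 / od280 <= high


print(dna_conc_ug_per_ml(0.5))          # undiluted reading
print(purity_ok(od260=1.0, od280=0.55)) # ratio of about 1.82
print(purity_ok(od260=1.0, od280=0.70)) # ratio of about 1.43
```

A preparation failing the ratio check would be set aside rather than carried into the pooling and amplification steps.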

Once genomic DNA from every individual in the given population has been extracted, it is preferred that a
fraction of each DNA sample is separated, after which a pool of DNA is constituted by assembling equivalent DNA 
amounts of the separated fractions into a single one. 

Second, the DNA obtained from peripheral blood as described above is amplified using the above mentioned 
amplification primers. 

Example 6 provides procedures that may be used in the amplification reactions, and the detection of
polymorphisms within the obtained amplicons.

Example 6

Amplification of DNA from Peripheral Blood
and Identification of Biallelic Markers
The amplification of each sequence is performed on pooled DNA samples obtained as in Example 5 above, using
PCR (Polymerase Chain Reaction) as follows:

• final volume 25 µl
• genomic DNA 2 ng/µl
• MgCl2 2 mM
• dNTP (each) 200 µM
• primer (each) 2.9 ng/µl
• Ampli Taq Gold DNA polymerase (Perkin) 0.05 unit/µl
• PCR buffer (10X = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl) 1X.

The synthesis of primers is performed following the phosphoramidite method, on a 
GENSET UFPS 24.1 synthesizer. 

To reduce the expense of preparing amplification primers for use in the above procedures, short primers may be 
used. While primers and probes having between 15 and 20 (or more) nucleotides are usually highly specific to a given 
nucleic acid sequence, it may be inconvenient and expensive to synthesize a relatively long oligonucleotide for each 
analysis. In order to at least partially circumvent this problem, it is often possible to use smaller but still relatively 
specific oligonucleotides that are shorter in length to create a manageable library. For example, a library of 
oligonucleotides comprising about 8 to 10 nucleotides is conceivable and has already been used for sequencing of a
40,000 bp cosmid DNA (Studier, Proc. Natl. Acad. Sci. USA 86(18):6917-6921 (1989), the disclosure of which is
incorporated herein by reference).

Another potential way to obtain specific primers and probes with a small library of oligonucleotides is to 
generate longer, more specific primers and probes from combinations of shorter, less specific oligonucleotides. Libraries 
of shorter oligonucleotides, each one being from about five to eight nucleotides in length, have already been used 
(Kieleczawa et al., Science 258:1787-1791 (1992); Kotler et al., Proc. Natl. Acad. Sci. USA 90:4241-4245 (1993);
Kaczorowski and Szybalski, Anal. Biochem. 221:127-135 (1994), the disclosures of which are incorporated herein by
reference). Suitable probes and primers of appropriate length can therefore be designed through the association of two
or three shorter oligonucleotides to constitute modular primers. The association between primers can be either covalent,
resulting from the activity of DNA T4 ligase, or non-covalent, through base-stacking energy.

The amplification is performed on a Perkin Elmer 9600 Thermocycler or MJ Research PTC200 with heating lid. 
After heating at 95° C for 10 minutes, 40 cycles are performed. Each cycle comprises: 30 sec at 95° C, 1 minute at 
54°C, and 30 sec at 72°C. For final elongation, 10 minutes at 72°C ends the amplification. 
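
The cycling program above fixes the minimum run time of each amplification. The sketch below adds up the programmed hold times; it ignores temperature ramping, which depends on the instrument, and the function name is hypothetical.

```python
def program_minutes(initial_min, cycles, step_seconds, final_min):
    """Total programmed hold time in minutes, excluding ramp times."""
    return initial_min + cycles * sum(step_seconds) / 60 + final_min


# 10 min at 95 C, then 40 cycles of 30 s / 60 s / 30 s, then 10 min at 72 C.
print(program_minutes(10, 40, (30, 60, 30), 10))
```

The programmed holds therefore total 100 minutes per run, before ramping overhead.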

The quantities of the amplification products obtained are determined on 96-well microtiter plates, using a
fluorimeter and Picogreen as intercalating agent (Molecular Probes).

The sequences of the amplification products are determined using automated dideoxy terminator sequencing
reactions with a dye-primer cycle sequencing protocol. The products of the sequencing reactions are run on sequencing
gels and the sequences are determined using gel image analysis.

The sequence data are evaluated using software designed to detect the presence of biallelic sites among the 
pooled amplified fragments. The polymorphism search is based on the presence of superimposed peaks in the 
electrophoresis pattern resulting from different bases occurring at the same position. Because each dideoxy terminator 
is labeled with a different fluorescent molecule, the two peaks corresponding to a biallelic site present distinct colors 
corresponding to two different nucleotides at the same position on the sequence. The software evaluates the intensity 
ratio between the two peaks and the intensity ratio between a given peak and surrounding peaks of the same color. 

However, the presence of two peaks can be an artifact due to background noise. To exclude such an artifact,
the two DNA strands are sequenced and a comparison between the peaks is carried out. In order to be registered as a
polymorphic sequence, the polymorphism has to be detected on both strands. 
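
The both-strands requirement can be sketched as a comparison of candidate sites called on each strand. The data layout below (position mapped to the pair of bases seen, with reverse-strand calls given as read on that strand and already mapped to forward coordinates) and the function name are assumptions made for illustration; the detection software itself is not described here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def confirmed_sites(forward, reverse):
    """Keep candidate biallelic sites supported by both strands.

    forward and reverse map a position (forward coordinates) to the set of
    two bases observed at that position on the respective strand.
    """
    confirmed = {}
    for pos, alleles in forward.items():
        rev = reverse.get(pos)
        if rev is None:
            continue  # seen on one strand only: treated as possible noise
        rev_as_forward = frozenset(COMPLEMENT[base] for base in rev)
        if rev_as_forward == frozenset(alleles):
            confirmed[pos] = frozenset(alleles)
    return confirmed


forward_calls = {120: {"A", "G"}, 300: {"C", "T"}}
reverse_calls = {120: {"T", "C"}}  # complement of A/G
print(confirmed_sites(forward_calls, reverse_calls))
```

Here the A/G site at position 120 is retained because the reverse strand shows the complementary T/C pair, while the site at position 300 is dropped for lack of support on the second strand.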

The above procedure permits those amplification products which contain biallelic markers to be identified. 
The detection limit for the frequency of biallelic polymorphisms detected by sequencing pools of 100
individuals is about 10% for the minor allele, as verified by sequencing pools of known allelic frequencies. However,
more than 90% of the biallelic polymorphisms detected by the pooling method have a frequency for the minor allele
higher than 25%. Therefore, the biallelic markers selected by this method have a frequency of at least 10% for the minor
allele and 90% or less for the major allele, preferably at least 20% for the minor allele and 80% or less for the major
allele, more preferably at least 30% for the minor allele and 70% or less for the major allele, thus a heterozygosity rate
higher than 0.18, preferably higher than 0.32, more preferably higher than 0.42.
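
For a marker with two alleles, the expected heterozygosity follows from the allele frequencies as 2p(1 - p), where p is the minor allele frequency. The sketch below assumes random mating (Hardy-Weinberg proportions), which the text does not state explicitly; the function name is hypothetical.

```python
def heterozygosity(minor_allele_freq):
    """Expected heterozygosity 2p(1 - p) for a biallelic marker."""
    p = minor_allele_freq
    if not 0.0 <= p <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    return 2.0 * p * (1.0 - p)


for freq in (0.10, 0.20, 0.25, 0.30):
    print(freq, round(heterozygosity(freq), 3))
```

Minor allele frequencies of 10%, 20% and 30% give heterozygosity rates of about 0.18, 0.32 and 0.42, and a frequency of 25% gives 0.375.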

In an initial study to determine the frequency of biallelic markers in the human genome that can be obtained 
using the above methods the following results were obtained. 300 different amplicons derived from 100 individuals, and 
covering a total of 150 kb obtained from different genomic regions, were sequenced. A total of 54 biallelic 
polymorphisms were identified, indicating that there is one biallelic polymorphism with a heterozygosity rate higher than 
0.18 (frequency of the minor allele higher than 10%), preferably higher than 0.38 (frequency of the minor allele higher 
than 25%), every 2.5 to 3 kb. Given that the human genome is about 3 x 10^6 kb long, this indicates that, out of the 10^7
biallelic markers present on the human genome, approximately 10^6 have adequate heterozygosity rates for genetic
mapping purposes. 
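
The density estimate can be reproduced from the figures of the initial study. The sketch below is simple arithmetic on the numbers quoted in the text (54 polymorphisms in 150 kb, genome of about 3 x 10^6 kb, 500 bp fragments); the function names are hypothetical.

```python
def kb_per_marker(total_kb, n_markers):
    """Average spacing between markers in the sequenced sample."""
    return total_kb / n_markers


def genome_marker_estimate(genome_kb, spacing_kb):
    """Number of markers expected genome-wide at a given spacing."""
    return genome_kb / spacing_kb


def fragments_per_marker(spacing_kb, fragment_kb=0.5):
    """Fragments to screen, on average, to find one marker."""
    return spacing_kb / fragment_kb


spacing = kb_per_marker(150, 54)
print(round(spacing, 2))
print(round(genome_marker_estimate(3_000_000, spacing)))
print(fragments_per_marker(3.0))
```

The observed spacing of about 2.8 kb falls inside the stated 2.5 to 3 kb range, extrapolates to roughly one million markers genome-wide, and corresponds to about six 500 bp fragments screened per marker at the 3 kb end of the range.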

Using the procedures of Examples 1-6, sets containing increasing numbers of biallelic markers may be
constructed. For example, the procedures of Examples 1-6 are used to identify 1 to about 50 biallelic markers. In some
embodiments, the procedures of Examples 1-6 are used to identify about 50 to about 200 biallelic markers. In other
embodiments, the procedures of Examples 1-6 are used to identify about 200 to about 500 biallelic markers. In some
embodiments, the procedures of Examples 1-6 are used to identify about 1,000 biallelic markers. In other embodiments,
the procedures of Examples 1-6 are used to identify about 3,000 biallelic markers. In further embodiments, the
procedures of Examples 1-6 are used to identify about 5,000 biallelic markers. In another embodiment, the procedures
of Examples 1-6 are used to identify about 10,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 20,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 40,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 60,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 80,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify more than 100,000 biallelic markers. In a further embodiment, the procedures of
Examples 1-6 are used to identify more than 120,000 biallelic markers.

As discussed above, the ordered nucleic acids, such as the inserts in BAC clones, which contain the biallelic
markers of the present invention may span a portion of the genome. For example, the ordered nucleic acids may span at
least 100 kb of contiguous genomic DNA, at least 250 kb of contiguous genomic DNA, at least 500 kb of contiguous
genomic DNA, at least 2 Mb of contiguous genomic DNA, at least 5 Mb of contiguous genomic DNA, at least 10 Mb of
contiguous genomic DNA, or at least 20Mb of contiguous genomic DNA. 

In addition, groups of biallelic markers located in proximity to one another along the genome may be identified
within these portions of the genome for use in haplotyping analyses as described below. The biallelic markers included
in each of these groups may be located within a genomic region spanning less than 1 kb, from 1 to 5 kb, from 5 to 10 kb,
from 10 to 25 kb, from 25 to 50 kb, from 50 to 150 kb, from 150 to 250 kb, from 250 to 500 kb, from 500 kb to 1 Mb, or
more than 1 Mb. It will be appreciated that the ordered DNA fragments containing these groups of biallelic markers need not
completely cover the genomic regions of these lengths but may instead be incomplete contigs having one or more gaps
therein. As discussed in further detail below, biallelic markers may be used in single marker and haplotype association
analyses regardless of the completeness of the corresponding physical contig harboring them. 

Using the procedures above, 653 biallelic markers, each having two alleles, were identified using sequences 
obtained from BACs which had been localized on the genome. In some cases, markers were identified using pooled BACs
and thereafter reassigned to individual BACs using STS screening procedures such as those described in Examples 2 and
7. The sequences of 50 of these 653 biallelic markers are provided in the accompanying Sequence Listing as SEQ ID
Nos. 1-50 and 51-100 (with SEQ ID Nos. 1-50 being one allele of these 50 biallelic markers and SEQ ID Nos. 51-100
being the other allele of these 50 biallelic markers). Although the sequences of SEQ ID Nos. 1-50 and 51-100 will be
used as exemplary markers throughout the present application, it will be appreciated that the biallelic markers used in
the maps of the present invention are not limited to these particular markers, nor are they limited to having the exact
flanking sequences surrounding the polymorphic bases which are enumerated in SEQ ID Nos. 1-50 and 51-100. Rather,
it will be appreciated that the flanking sequences surrounding the polymorphic bases of SEQ ID Nos. 1-50 and 51-100
may be lengthened or shortened to any extent compatible with their intended use and the present invention specifically 
contemplates such sequences. The sequences of these 653 biallelic markers, including the sequences of SEQ ID Nos. 1- 
50 and 51-100 may be used to construct the maps of the present invention as well as in the gene identification and 
diagnostic techniques described herein. It will be appreciated that the biallelic markers referred to herein may be of any
length compatible with their intended use provided that the markers include the polymorphic base, and the present
invention specifically contemplates such sequences.

Ordering of biallelic markers
Biallelic markers can be ordered to determine their positions along chromosomes, preferably subchromosomal
regions, most preferably along the above described minimally overlapping ordered BAC arrays, as follows.

The positions of the biallelic markers along chromosomes may be determined using a variety of methodologies.
In one approach, radiation hybrid mapping is used. Radiation hybrid (RH) mapping is a somatic cell genetic approach that
can be used for high resolution mapping of the human genome. In this approach, cell lines containing one or more human
chromosomes are lethally irradiated, breaking each chromosome into fragments whose size depends on the radiation dose.
These fragments are rescued by fusion with cultured rodent cells, yielding subclones containing different portions of the
human genome. This technique is described by Benham et al. (Genomics 4:509-517, 1989) and Cox et al. (Science
250:245-250, 1990), the entire contents of which are hereby incorporated by reference. The random and independent
nature of the subclones permits efficient mapping of any human genome marker. Human DNA isolated from a panel of
80-100 cell lines provides a mapping reagent for ordering biallelic markers. In this approach, the frequency of breakage
between markers is used to measure distance, allowing construction of fine resolution maps as has been done for ESTs
(Schuler et al., Science 274:540-546, 1996, hereby incorporated by reference).
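
To make "frequency of breakage is used to measure distance" concrete, the sketch below estimates a breakage frequency between two markers from their retention patterns across a hybrid panel and converts it to a map distance. The text gives no formula: the estimator (discordant hybrids divided by T(R_A + R_B - 2 R_A R_B)) and the centiRay mapping function D = -100 ln(1 - theta) are assumptions taken from the commonly cited radiation hybrid treatment, and the function names are hypothetical.

```python
import math


def breakage_frequency(retained_a, retained_b):
    """Estimate the breakage frequency between two markers.

    retained_a and retained_b are equal-length sequences of truthy/falsy
    values, one per hybrid cell line, indicating marker retention.
    """
    if len(retained_a) != len(retained_b) or not retained_a:
        raise ValueError("patterns must be non-empty and of equal length")
    total = len(retained_a)
    discordant = sum(1 for a, b in zip(retained_a, retained_b)
                     if bool(a) != bool(b))
    r_a = sum(map(bool, retained_a)) / total
    r_b = sum(map(bool, retained_b)) / total
    denom = total * (r_a + r_b - 2.0 * r_a * r_b)
    if denom == 0.0:
        raise ValueError("retention is uninformative for these markers")
    # Cap below 1 so the logarithm in the mapping function stays finite.
    return min(discordant / denom, 0.999)


def distance_centirays(theta):
    """Map distance in centiRays for breakage frequency theta (assumed form)."""
    return max(0.0, -100.0 * math.log(1.0 - theta))


marker_a = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
marker_b = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
theta = breakage_frequency(marker_a, marker_b)
print(round(theta, 3), round(distance_centirays(theta), 1))
```

Markers that are always retained or lost together give a breakage frequency of zero and a distance of zero; the more often they are separated across the panel, the larger the estimated distance.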

RH mapping has been used to generate a high-resolution whole genome radiation hybrid map of human
chromosome 17q22-q25.3 across the genes for growth hormone (GH) and thymidine kinase (TK) (Foster et al., Genomics
33:185-192, 1996), the region surrounding the Gorlin syndrome gene (Obermayr et al., Eur. J. Hum. Genet. 4:242-245,
1995), 60 loci covering the entire short arm of chromosome 12 (Raeymaekers et al., Genomics 29:170-178, 1995), the
region of human chromosome 22 containing the neurofibromatosis type 2 locus (Frazer et al., Genomics 14:574-584, 1992)
and 13 loci on the long arm of chromosome 5 (Warrington et al., Genomics 11:701-708, 1991).

Alternatively, PCR based techniques and human-rodent somatic cell hybrids may be used to determine the 
positions of the biallelic markers on the chromosomes. In such approaches, oligonucleotide primer pairs which are capable of 
generating amplification products containing the polymorphic bases of the biallelic markers are designed. Preferably, the 
oligonucleotide primers are 16-23 bp in length and are designed for PCR amplification. The creation of PCR primers from 
known sequences is well known to those with skill in the art. For a review of PCR technology see Erlich HA, PCR
Technology: Principles and Applications for DNA Amplification. 1992. W.H. Freeman and Co., New York.

The primers are used in polymerase chain reactions (PCR) to amplify templates from total human genomic DNA.
PCR conditions are as follows: 60 ng of genomic DNA is used as a template for PCR with 80 ng of each oligonucleotide
primer, 0.6 unit of Taq polymerase, and 1 µCi of a 32P-labeled deoxycytidine triphosphate. The PCR is performed in a
microplate thermocycler (Techne) under the following conditions: 30 cycles of 94°C, 1.4 min; 55°C, 2 min; and 72°C, 2 min;
with a final extension at 72°C for 10 min. The amplified products are analyzed on a 6% polyacrylamide sequencing gel and
visualized by autoradiography. If the length of the resulting PCR product is identical to the length expected for an
amplification product containing the polymorphic base of the biallelic marker, then the PCR reaction is repeated with DNA
templates from two panels of human-rodent somatic cell hybrids, BIOS PCRable DNA (BIOS Corporation) and NIGMS
Human-Rodent Somatic Cell Hybrid Mapping Panel Number 1 (NIGMS, Camden, NJ).

PCR is used to screen a series of somatic cell hybrid cell lines containing defined sets of human chromosomes for
the presence of a given biallelic marker. DNA is isolated from the somatic hybrids and used as starting templates for PCR
reactions using the primer pairs from the biallelic marker. Only those somatic cell hybrids with chromosomes containing the
human sequence corresponding to the biallelic marker will yield an amplified fragment. The biallelic markers are assigned to
a chromosome by analysis of the segregation pattern of PCR products from the somatic hybrid DNA templates. The single
human chromosome present in all cell hybrids that give rise to an amplified fragment is the chromosome containing that
biallelic marker. For a review of techniques and analysis of results from somatic cell gene mapping experiments, see
Ledbetter et al., Genomics 6:475-481 (1990).
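
The chromosome assignment logic described above is a set intersection over the hybrids that give an amplified fragment. The sketch below illustrates it; the panel contents are invented for the example, the function name is hypothetical, and the additional exclusion of chromosomes present in PCR-negative hybrids is an assumption that goes slightly beyond the rule stated in the text.

```python
def assign_chromosome(hybrid_content, pcr_positive):
    """Return the chromosomes consistent with the PCR segregation pattern.

    hybrid_content maps a hybrid cell line name to the set of human
    chromosomes it carries; pcr_positive is the set of hybrid names that
    yielded an amplified fragment.
    """
    positives = [set(hybrid_content[name]) for name in pcr_positive]
    if not positives:
        return set()
    # Chromosomes present in every PCR-positive hybrid.
    candidates = set.intersection(*positives)
    # Assumed refinement: drop chromosomes also carried by negative hybrids.
    for name, chromosomes in hybrid_content.items():
        if name not in pcr_positive:
            candidates -= set(chromosomes)
    return candidates


# Hypothetical three-hybrid panel, for illustration only.
panel = {
    "H1": {"1", "7", "X"},
    "H2": {"7", "12"},
    "H3": {"1", "12"},
}
print(assign_chromosome(panel, {"H1", "H2"}))
```

A result containing exactly one chromosome corresponds to an unambiguous assignment; an empty or multi-element result indicates that the panel cannot resolve the marker and further hybrids are needed.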

Example 7 describes a preferred method for positioning of biallelic markers on clones, such as BAC clones,
obtained from genomic DNA libraries. 

Example 7 

Screening BAC libraries with biallelic markers
Amplification primers enabling the specific amplification of DNA fragments carrying the biallelic markers (including
the 653 biallelic markers obtained above, which include the sequences of SEQ ID Nos. 1-50 and 51-100) may be used to
screen clones in any genomic DNA library, preferably the BAC libraries described above, for the presence of the biallelic
markers.

Pairs of primers were designed which allowed the amplification of fragments carrying the 653 biallelic markers
obtained above. The amplification primers may be used to screen clones in a genomic DNA library for the presence of the
653 biallelic markers. For example, pairs of amplification primers of SEQ ID Nos. 101-150 and 151-200 may be used to
amplify fragments which include the polymorphic bases of the biallelic markers of SEQ ID Nos. 1-50 and 51-100.

It will be appreciated that amplification primers for the biallelic markers may be any sequences which allow the
specific amplification of any DNA fragment carrying the markers and may be designed using techniques familiar to those
skilled in the art. The amplification primers may be oligonucleotides of 8, 10, 15, 20 or more bases in length which
enable the amplification of any fragment carrying the polymorphic site in the markers. The polymorphic base may be in
the center of the amplification product or, alternatively, it may be located off-center. For example, in some
embodiments, the amplification product produced using these primers may be at least 100 bases in length (i.e. 50
nucleotides on each side of the polymorphic base in amplification products in which the polymorphic base is centrally
located). In other embodiments, the amplification product produced using these primers may be at least 500 bases in
length (i.e. 250 nucleotides on each side of the polymorphic base in amplification products in which the polymorphic base
is centrally located). In still further embodiments, the amplification product produced using these primers may be at
least 1000 bases in length (i.e. 500 nucleotides on each side of the polymorphic base in amplification products in which
the polymorphic base is centrally located). Amplification primers such as those described above are included within the 
scope of the present invention. 

The localization of biallelic markers on BAC clones is performed essentially as described in Example 2.

The BAC clones to be screened are distributed in three dimensional pools as described in Example 2.
Amplification reactions are conducted on the pooled BAC clones using primers specific for the biallelic markers
to identify BAC clones which contain the biallelic markers, using procedures essentially similar to those described in
Example 2.

Amplification products resulting from the amplification reactions are detected by conventional agarose gel
electrophoresis combined with automatic image capturing and processing. PCR screening for a biallelic marker involves
three steps: (1) identifying the positive primary pools; (2) for each positive primary pool, identifying the positive plate,
row and column 'subpools' to obtain the address of the positive clone; (3) directly confirming the PCR assay on the
identified clone. PCR assays are performed with primers defining the biallelic marker.
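
Step (2) of the screening scheme above, reading a clone address off the positive plate, row and column subpools, can be sketched as follows. This is only an illustrative sketch: the function names and the representation of subpools as plain Python sets are assumptions, not part of the described procedure. When more than one subpool is positive in any dimension, the address is ambiguous and every candidate would need the direct confirmation of step (3).

```python
from itertools import product


def candidate_addresses(positive_plates, positive_rows, positive_columns):
    """Return every (plate, row, column) address consistent with the positive subpools."""
    return list(product(sorted(positive_plates),
                        sorted(positive_rows),
                        sorted(positive_columns)))


def resolve_positive_clone(positive_plates, positive_rows, positive_columns):
    """Return the unique clone address, or None when the subpools are ambiguous or empty."""
    candidates = candidate_addresses(positive_plates, positive_rows, positive_columns)
    return candidates[0] if len(candidates) == 1 else None
```

For example, `resolve_positive_clone({3}, {"C"}, {7})` yields the single address `(3, "C", 7)`, whereas two positive plates within the same primary pool leave two candidates to be confirmed individually.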

Screening is conducted as follows. First, BAC DNA is isolated as follows. Bacteria containing the genomic
inserts are grown overnight at 37°C in 120 µl of LB containing chloramphenicol (12 µg/ml). DNA is extracted by the
following protocol:

Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and resuspend pellet in 120 µl TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM)
Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and incubate pellet with 20 µl lysozyme 1 mg/ml during 15 min at room temperature
Add 20 µl proteinase K 100 µg/ml and incubate 15 min at 60°C
Add 8 µl DNAse 2 U/µl and incubate 1 hr at room temperature
Add 100 µl TE 10-2 and keep at -80°C



PCR assays are performed using the following protocol:

Final volume                                          15 µl
BAC DNA                                               1.7 ng/µl
MgCl2                                                 2 mM
dNTP (each)                                           200 µM
primer (each)                                         2.9 ng/µl
AmpliTaq Gold DNA polymerase                          0.05 unit/µl
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl)   1x



The amplification is performed on a Genius II thermocycler. After heating at 95°C for 10 min, 40 cycles are
performed. Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at
72°C ends the amplification. PCR products are analyzed on 1% agarose gel with 0.1 mg/ml ethidium bromide.
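
The per-reaction quantities implied by the protocol above follow from simple arithmetic on the stated final concentrations and the 15 µl final volume. The sketch below works them out, together with the programmed cycling time (ignoring temperature ramping, which is an assumption). The dictionary keys and function names are illustrative only.

```python
FINAL_VOLUME_UL = 15.0

# Final concentrations per microliter, as listed in the PCR protocol.
PER_UL = {
    "BAC DNA (ng)": 1.7,
    "each primer (ng)": 2.9,
    "polymerase (units)": 0.05,
}


def per_reaction_amounts(volume_ul=FINAL_VOLUME_UL):
    """Total amount of each component in one reaction of the given volume."""
    return {name: round(per_ul * volume_ul, 3) for name, per_ul in PER_UL.items()}


def buffer_10x_volume_ul(volume_ul=FINAL_VOLUME_UL):
    """Volume of 10x PCR buffer needed to reach 1x in the final volume."""
    return volume_ul / 10.0


def programmed_minutes(cycles=40):
    """Programmed run time: 10 min at 95°C, the cycles, then 10 min at 72°C."""
    initial_seconds = 10 * 60
    seconds_per_cycle = 30 + 60 + 30   # denaturation, annealing, extension
    final_seconds = 10 * 60
    return (initial_seconds + cycles * seconds_per_cycle + final_seconds) / 60
```

Under these figures one 15 µl reaction contains 25.5 ng of BAC DNA, 43.5 ng of each primer and 0.75 unit of polymerase, and the programmed steps add up to 100 minutes.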

Using such procedures, a number of BAC clones carrying selected biallelic markers can be isolated. The
position of these BAC clones on the human genome can be defined by performing STS screening as described in Example
2. Preferably, to decrease the number of STSs to be tested, each BAC can be localized on chromosomal or
subchromosomal regions by procedures such as those described in Examples 8 and 9 below. This localization will allow
the selection of a subset of STSs corresponding to the identified chromosomal or subchromosomal region. Testing each
BAC with such a subset of STSs and taking account of the position and order of the STSs along the genome will allow a
refined positioning of the corresponding biallelic marker along the genome.

In other embodiments, if the DNA library used to isolate BAC inserts or any type of genomic DNA fragments
harboring the selected biallelic markers already constitutes a physical map of the genome or any portion thereof, using the
known order of the DNA fragments will allow the order of the biallelic markers to be established.

As discussed above, it will be appreciated that markers carried by the same fragment of genomic DNA, such as
the insert in a BAC clone, need not necessarily be ordered with respect to one another within the genomic fragment to
conduct single point or haplotype association analyses. However, in other embodiments of the present maps, the order of
biallelic markers carried by the same fragment of genomic DNA may be determined.

The positions of the biallelic markers used to construct the maps of the present invention, including the 653
biallelic markers obtained above, may be assigned to subchromosomal locations using Fluorescence In Situ Hybridization
(FISH) (Cherif et al., Proc. Natl. Acad. Sci. U.S.A. 87:6639-6643 (1990), the disclosure of which is incorporated herein by
reference). FISH analysis is described in Example 8 below.

Example 8

Assignment of Biallelic Markers to Subchromosomal Regions
Metaphase chromosomes are prepared from phytohemagglutinin (PHA)-stimulated blood cell donors. PHA-stimulated
lymphocytes from healthy males are cultured for 72 h in RPMI-1640 medium. For synchronization, methotrexate
(10 nM) is added for 17 h, followed by addition of 5-bromodeoxyuridine (5-BudR, 0.1 mM) for 6 h. Colcemid (1 µg/ml) is
added for the last 15 min before harvesting the cells. Cells are collected, washed in RPMI, incubated with a hypotonic
solution of KCl (75 mM) at 37°C for 15 min and fixed in three changes of methanol:acetic acid (3:1). The cell suspension is
dropped onto a glass slide and air-dried.

BAC clones carrying the biallelic markers used to construct the maps of the present invention (including the 653
biallelic markers obtained above) can be isolated as described above. These BACs or portions thereof, including fragments
carrying said biallelic markers, obtained for example from amplification reactions using pairs of amplification primers as
described above, can be used as probes to be hybridized with metaphasic chromosomes. It will be appreciated that the
hybridization probes to be used in the contemplated method may be generated using alternative methods well known to
those skilled in the art. Hybridization probes may have any length suitable for this intended purpose.

Probes are then labeled with biotin-16 dUTP by nick translation according to the manufacturer's instructions
(Bethesda Research Laboratories, Bethesda, MD), purified using a Sephadex G-50 column (Pharmacia, Uppsala, Sweden) and
precipitated. Just prior to hybridization, the DNA pellet is dissolved in hybridization buffer (50% formamide, 2 X SSC, 10%
dextran sulfate, 1 mg/ml sonicated salmon sperm DNA, pH 7) and the probe is denatured at 70°C for 5-10 min.

Slides kept at -20°C are treated for 1 h at 37°C with RNase A (100 µg/ml), rinsed three times in 2 X SSC and
dehydrated in an ethanol series. Chromosome preparations are denatured in 70% formamide, 2 X SSC for 2 min at 70°C,
then dehydrated at 4°C. The slides are treated with proteinase K (10 µg/100 ml in 20 mM Tris-HCl, 2 mM CaCl2) at 37°C
for 8 min and dehydrated. The hybridization mixture containing the probe is placed on the slide, covered with a coverslip,
sealed with rubber cement and incubated overnight in a humid chamber at 37°C. After hybridization and post-hybridization
washes, the biotinylated probe is detected by avidin-FITC and amplified with additional layers of biotinylated goat anti-avidin
and avidin-FITC. For chromosomal localization, fluorescent R-bands are obtained as previously described (Cherif et al. (1990),
supra). The slides are observed under a LEICA fluorescence microscope (DMRXA). Chromosomes are counterstained with
propidium iodide and the fluorescent signal of the probe appears as two symmetrical yellow-green spots on both chromatids
of the fluorescent R-band chromosome (red). Thus, a particular biallelic marker may be localized to a particular cytogenetic
R-band on a given chromosome.

The above procedure was used to confirm the subchromosomal location of 95% of the BAC clones harboring the
653 markers obtained above. In particular, the 50 markers of SEQ ID Nos. 1-50 and 51-100 were assigned to
subchromosomal regions of chromosome 21. Simple identification numbers were attributed to each BAC from which the
markers are derived. Figure 1 is a cytogenetic map of chromosome 21 indicating the subchromosomal regions therein. Table
1 lists the internal identification number of the localized biallelic markers, the internal identification number of the BACs from
which the markers were derived, the size of the BAC insert, the average intermarker distance in the BAC insert and the
subchromosomal locations of the biallelic markers. The sequences of the localized markers are provided as SEQ ID Nos. 1-50
and 51-100 in the accompanying sequence listing. Amplification primers for generating amplification products containing
the polymorphic bases of these markers are also provided as SEQ ID Nos. 101-150 and 151-200 in the accompanying
sequence listing. Microsequencing primers for use in determining the identities of the polymorphic bases of these biallelic
markers are provided in the accompanying Sequence Listing as SEQ ID Nos. 201-250 and 251-300.

The rate at which biallelic markers may be assigned to subchromosomal regions may be enhanced through
automation. For example, probe preparation may be performed in a microtiter plate format, using adequate robots. The rate
at which biallelic markers may be assigned to subchromosomal regions may be enhanced using techniques which permit the
in situ hybridization of multiple probes on a single microscope slide, such as those disclosed in Larin et al., Nucleic Acids
Research 22: 3689-3692 (1994), the disclosure of which is incorporated herein by reference. In the largest test format
described, different probes were hybridized simultaneously by applying them directly from a 96-well microtiter dish which
was inverted on a glass plate. Software for image data acquisition and analysis that is adapted to each optical system, test
format and fluorescent probe used can be derived from the system described in Lichter et al., Science 247: 64-69 (1990),
the disclosure of which is incorporated herein by reference. Such software measures the relative distance between the
center of the fluorescent spot corresponding to the hybridized probe and the telomeric end of the short arm of the
corresponding chromosome, as compared to the total length of the chromosome. The rate at which biallelic markers are
assigned to subchromosomal locations may be further enhanced by simultaneously applying probes labeled with different
fluorescent tags to each well of the 96 well dish. A further benefit of conducting the analysis on one slide is that it
facilitates automation, since a microscope having a moving stage and the capability of detecting fluorescent signals in
different metaphase chromosomes could provide the coordinates of each probe on the metaphase chromosomes distributed
on the 96 well dish.
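
The relative-distance measurement described above reduces to a single ratio: the distance from the short-arm telomere to the center of the fluorescent spot, divided by the total chromosome length. A minimal sketch follows; the function name and the choice of arbitrary but matching length units (for example pixels) are assumptions, not features of the cited software.

```python
def fractional_length_from_pter(spot_to_pter, chromosome_length):
    """Position of a FISH signal as a fraction of chromosome length, measured from the short-arm telomere."""
    if chromosome_length <= 0:
        raise ValueError("chromosome_length must be positive")
    if not 0 <= spot_to_pter <= chromosome_length:
        raise ValueError("spot must lie on the chromosome")
    return spot_to_pter / chromosome_length
```

A spot measured 2.5 units from the short-arm telomere on a chromosome 10 units long would be reported at a fractional position of 0.25, which can then be compared between metaphases regardless of how condensed each chromosome is.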

Example 9 below describes an alternative method to position biallelic markers which allows their assignment to
human chromosomes.

Example 9

Assignment of Biallelic Markers to Human Chromosomes
The biallelic markers used to construct the maps of the present invention, including the 653 biallelic markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), may be assigned to a human
chromosome using monosomal analysis as described below.

The chromosomal localization of a biallelic marker can be performed through the use of somatic cell hybrid
panels. For example, 24 panels, each panel containing a different human chromosome, may be used (Russell et al.,
Somat. Cell Mol. Genet. 22:425-431 (1996); Drwinga et al., Genomics 16:311-314 (1993), the disclosures of which are
incorporated herein by reference).

The biallelic markers are localized as follows. The DNA of each somatic cell hybrid is extracted and purified.
Genomic DNA samples from a somatic cell hybrid panel are prepared as follows. Cells are lysed overnight at 42°C with
3.7 ml of lysis solution composed of:

3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
200 µl SDS 10%
500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M)

For the extraction of proteins, 1 ml saturated NaCl (6 M) (1/3.5 v/v) is added. After vigorous agitation, the
solution is centrifuged for 20 min at 10,000 rpm. For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are
added to the previous supernatant, and the solution is centrifuged for 30 min at 2,000 rpm. The DNA solution is rinsed
three times with 70% ethanol to eliminate salts, and centrifuged for 20 min at 2,000 rpm. The pellet is dried at 37°C,
and resuspended in 1 ml TE 10-1 or 1 ml water. The DNA concentration is evaluated by measuring the OD at 260 nm (1
unit OD = 50 µg/ml DNA). To determine the presence of proteins in the DNA solution, the OD260/OD280 ratio is
determined. Only DNA preparations having an OD260/OD280 ratio between 1.8 and 2 are used in the PCR assay.
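
The two spectrophotometric checks above can be written out as a short calculation. The sketch below uses the stated conversion (1 OD unit at 260 nm corresponds to 50 µg/ml DNA) and the stated acceptance window of 1.8 to 2 for the OD260/OD280 ratio. The function names and the optional dilution factor are assumptions added for illustration.

```python
UG_PER_ML_PER_OD260 = 50.0   # conversion stated in the protocol
PURITY_RANGE = (1.8, 2.0)    # acceptable OD260/OD280 window stated in the protocol


def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration from the OD at 260 nm, corrected for any dilution of the measured sample."""
    return od260 * UG_PER_ML_PER_OD260 * dilution_factor


def passes_purity_check(od260, od280):
    """True when the OD260/OD280 ratio falls within the accepted window."""
    if od280 <= 0:
        raise ValueError("od280 must be positive")
    low, high = PURITY_RANGE
    return low <= od260 / od280 <= high
```

For instance, an undiluted sample reading 0.5 at 260 nm corresponds to 25 µg/ml, and a sample with readings of 1.0 and 0.7 (ratio about 1.43) would be rejected as protein-contaminated.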

Then, a PCR assay is performed on genomic DNA with primers defining the biallelic marker. The PCR assay is
performed as described above for BAC screening. The PCR products are analyzed on a 1% agarose gel containing 0.2
mg/ml ethidium bromide.
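
The read-out of a monochromosomal hybrid panel can be sketched as a lookup: the marker is assigned to the one human chromosome carried by the hybrids that give an amplified fragment. The code below is a hedged illustration only. The hybrid names, the dictionary-based data layout and the rule of returning `None` for empty or conflicting results are assumptions.

```python
def assign_chromosome(pcr_positive_by_hybrid, chromosome_by_hybrid):
    """Return the human chromosome shared by all PCR-positive hybrids, or None if unresolved."""
    positive_chromosomes = {
        chromosome_by_hybrid[hybrid]
        for hybrid, is_positive in pcr_positive_by_hybrid.items()
        if is_positive
    }
    if len(positive_chromosomes) == 1:
        return positive_chromosomes.pop()
    return None  # no amplification, or amplification from several different chromosomes


# Hypothetical 24-member panel: one hybrid per human chromosome.
CHROMOSOMES = [str(number) for number in range(1, 23)] + ["X", "Y"]
PANEL = {f"hybrid_{name}": name for name in CHROMOSOMES}
```

With this hypothetical panel, a marker amplified only from `hybrid_21` is assigned to chromosome 21, while a marker amplified from two different hybrids is left unassigned for re-testing.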

The ordering analyses described above may be conducted to generate an integrated genome wide genetic map
comprising about 20,000 biallelic markers (1 biallelic marker per BAC if 20,000 BAC inserts are screened). In some
embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto).

In another embodiment, the above procedures are conducted to generate a map comprising about 40,000
markers (an average of 2 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

In a further preferred embodiment, the above procedures are conducted to generate a map
comprising about 60,000 markers (an average of 3 biallelic markers per BAC if 20,000 BAC inserts are screened). In
some embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).

In a further preferred embodiment, the above procedures are conducted to generate a map
comprising about 80,000 markers (an average of 4 biallelic markers per BAC if 20,000 BAC inserts are screened). In
some embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).

In yet another embodiment, the above procedures are conducted to generate a map comprising about 100,000
markers (an average of 5 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

In a further embodiment, the above procedures are conducted to generate a map comprising about 120,000
markers (an average of 6 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

Alternatively, maps having the above-specified average numbers of biallelic markers per BAC which comprise
smaller portions of the genome, such as a set of chromosomes, a single chromosome, a particular subchromosomal
region, or any other desired portion of the genome, may also be constructed using the procedures provided herein.

In some embodiments, the biallelic markers in the map are separated from one another by an average distance
of 10-200kb. In further embodiments, the biallelic markers in the map are separated from one another by an average
distance of 15-150kb. In yet another embodiment, the biallelic markers in the map are separated from one another by an
average distance of 20-100kb. In other embodiments, the biallelic markers in the map are separated from one another
by an average distance of 100-150kb. In further embodiments, the biallelic markers in the map are separated from one
another by an average distance of 50-100kb. In yet another embodiment, the biallelic markers in the map are separated
from one another by an average distance of 25-50kb. Maps having the above-specified intermarker distances which
comprise smaller portions of the genome, such as a set of chromosomes, a single chromosome, a particular
subchromosomal region, or any other desired portion of the genome, may also be constructed using the procedures
provided herein.

Figure 2, showing the results of computer simulations of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers, indicates the percentage of biallelic markers which will be spaced a given distance
apart for a given number of markers/BAC in the genomic map (assuming 20,000 BACs constituting a minimally
overlapping array covering the entire genome are evaluated). One hundred iterations were performed for each
simulation (20,000 marker map, 40,000 marker map, 60,000 marker map, 120,000 marker map).

As illustrated in Figure 2a, 98% of inter-marker distances will be lower than 150kb provided 60,000 evenly
distributed markers are generated (3 per BAC); 90% of inter-marker distances will be lower than 150kb provided 40,000
evenly distributed markers are generated (2 per BAC); and 50% of inter-marker distances will be lower than 150kb
provided 20,000 evenly distributed markers are generated (1 per BAC).

As illustrated in Figure 2b, 98% of inter-marker distances will be lower than 80kb provided 120,000 evenly
distributed markers are generated (6 per BAC); 80% of inter-marker distances will be lower than 80kb provided 60,000
evenly distributed markers are generated (3 per BAC); and 15% of inter-marker distances will be lower than 80kb
provided 20,000 evenly distributed markers are generated (1 per BAC).
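
A simulation of this kind can be sketched under one simple model, which is an assumption rather than a description of the simulation actually used for Figure 2: the 20,000 BACs are treated as contiguous, non-overlapping 150 kb tiles (a size chosen only so that 20,000 tiles span roughly 3,000 Mb), and a fixed number of markers is placed uniformly at random within each tile. The function name, the tile size and the single seeded run (instead of one hundred iterations) are all illustrative choices.

```python
import random


def fraction_of_gaps_below(markers_per_bac, threshold_kb,
                           n_bacs=20000, bac_kb=150.0, seed=0):
    """Fraction of adjacent-marker distances shorter than threshold_kb under the tiled-BAC model."""
    rng = random.Random(seed)
    positions = []
    for bac_index in range(n_bacs):
        bac_start = bac_index * bac_kb
        for _ in range(markers_per_bac):
            # Each marker falls uniformly at random inside its own BAC tile.
            positions.append(bac_start + rng.uniform(0.0, bac_kb))
    positions.sort()
    gaps = [right - left for left, right in zip(positions, positions[1:])]
    return sum(gap < threshold_kb for gap in gaps) / len(gaps)
```

For one marker per BAC this model can be checked by hand: the distance between markers in adjacent tiles is 150 kb plus the difference of two independent uniform offsets, so exactly half of the distances fall below 150 kb and about 14% fall below 80 kb, in line with the 50% and 15% figures quoted above. Calling `fraction_of_gaps_below(3, 150)` or `fraction_of_gaps_below(6, 80)` explores the denser maps in the same way.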

As already mentioned, high density biallelic marker maps allow association studies to be performed to identify
genes involved in complex traits.

Association studies examine the frequency of marker alleles in unrelated trait positive (T+) individuals
compared with trait negative (T-) controls, and are generally employed in the detection of polygenic inheritance.

Association studies as a method of mapping genetic traits rely on the phenomenon of linkage disequilibrium,
which is described below.

Linkage Disequilibrium

If two genetic loci lie on the same chromosome, then sets of alleles on the same chromosomal segment (called
haplotypes) tend to be transmitted as a block from generation to generation. When not broken up by recombination,
haplotypes can be tracked not only through pedigrees but also through populations. The resulting phenomenon at the
population level is that the occurrence of pairs of specific alleles at different loci on the same chromosome is not
random, and the deviation from random is called linkage disequilibrium (LD).

If a specific allele in a given gene is directly involved in causing a particular trait T, its frequency will be
statistically increased in a T+ population when compared to the frequency in a T- population. As a consequence of the
existence of LD, the frequency of all other alleles present in the haplotype carrying the trait-causing allele (TCA) will also
be increased in T+ individuals compared to T- individuals. Therefore, association between the trait and any allele in
linkage disequilibrium with the trait-causing allele will suffice to suggest the presence of a trait-related gene in that
particular allele's region. Linkage disequilibrium allows the relative frequencies in T+ and T- populations of a limited
number of genetic polymorphisms (specifically biallelic markers) to be analyzed as an alternative to screening all possible
functional polymorphisms in order to find trait-causing alleles.

The present invention then also concerns biallelic markers in linkage disequilibrium with the specific biallelic
markers described above and which are expected to present similar characteristics in terms of their respective
association with a given trait. In a preferred embodiment, the present invention concerns the biallelic markers that are in
linkage disequilibrium with the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto).

LD among a set of biallelic markers having an adequate heterozygosity rate can be determined by genotyping
between 50 and 1000 unrelated individuals, preferably between 75 and 200, more preferably around 100. Genotyping a
biallelic marker consists of determining the specific allele carried by an individual at the given polymorphic base of the
biallelic marker. Genotyping can be performed using similar methods as those described above for the generation of the
biallelic markers, or using other genotyping methods such as those further described below.

LD between any pair of biallelic markers comprising at least one of the biallelic markers of the present
invention (Mi, Mj) can be calculated for every allele combination (Mi1,Mj1; Mi1,Mj2; Mi2,Mj1 and Mi2,Mj2), according to the
Piazza formula:

$$\Delta_{M_{ik},M_{jl}} = \sqrt{\theta_4} - \sqrt{(\theta_4 + \theta_3)(\theta_4 + \theta_2)}$$

where:

$\theta_4$ (- -) = frequency of genotypes not having allele k at Mi and not having allele l at Mj
$\theta_3$ (- +) = frequency of genotypes not having allele k at Mi and having allele l at Mj
$\theta_2$ (+ -) = frequency of genotypes having allele k at Mi and not having allele l at Mj
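
The Piazza formula can be evaluated directly from genotype data by counting, for one allele combination (k, l), the individuals that lack allele k at Mi, lack allele l at Mj, or both. The sketch below is illustrative only: the function name and the representation of each individual as a pair of genotype strings (for example `("AG", "CT")`) are assumptions.

```python
from math import sqrt


def piazza_delta(genotypes, allele_k, allele_l):
    """Piazza LD value for allele k at marker Mi and allele l at marker Mj.

    genotypes: one (genotype_at_Mi, genotype_at_Mj) pair per individual,
    each genotype being a two-character string or a pair of alleles.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    neither = only_l = only_k = 0
    for genotype_i, genotype_j in genotypes:
        has_k = allele_k in genotype_i
        has_l = allele_l in genotype_j
        if not has_k and not has_l:
            neither += 1          # contributes to theta4 (- -)
        elif not has_k and has_l:
            only_l += 1           # contributes to theta3 (- +)
        elif has_k and not has_l:
            only_k += 1           # contributes to theta2 (+ -)
    theta4, theta3, theta2 = neither / n, only_l / n, only_k / n
    return sqrt(theta4) - sqrt((theta4 + theta3) * (theta4 + theta2))
```

When carriers of the two alleles are distributed independently (one quarter of individuals in each of the four classes), the value is zero; when the two alleles are always carried together it is positive.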



Linkage disequilibrium (LD) between pairs of biallelic markers (Mi, Mj) can also be calculated for every allele
combination (Mi1,Mj1 ; Mi1,Mj2 ; Mi2,Mj1 ; Mi2,Mj2) according to the maximum likelihood estimate (MLE) for delta (the
composite linkage disequilibrium coefficient), as described by Weir (B.S. Weir, Genetic Data Analysis, (1996), Sinauer
Ass. Eds., the disclosure of which is incorporated herein by reference). This formula allows linkage disequilibrium
between alleles to be estimated when only genotype, and not haplotype, data are available. This LD composite test
makes no assumption for random mating in the sampled population, and thus seems to be more appropriate than other
LD tests for genotypic data.
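
As an illustration of why genotype data suffice, the composite coefficient is commonly written as delta = n_AB/n - 2·p_A·p_B, where n_AB weights each two-marker genotype by (copies of allele A) × (copies of allele B) / 2, and p_A and p_B are the allele frequencies. The sketch below implements that commonly quoted form. It is an assumption that this matches the estimator referred to above in every detail (for example, small-sample corrections are not applied), and the function name and input layout are illustrative.

```python
def composite_ld(allele_counts):
    """Composite LD coefficient from unphased genotypes.

    allele_counts: one (copies_of_A_at_Mi, copies_of_B_at_Mj) pair per
    individual, each count being 0, 1 or 2.
    """
    n = len(allele_counts)
    if n == 0:
        raise ValueError("at least one genotype is required")
    p_a = sum(count_a for count_a, _ in allele_counts) / (2 * n)
    p_b = sum(count_b for _, count_b in allele_counts) / (2 * n)
    # AABB counts 2, AABb and AaBB count 1, AaBb counts 1/2, all others 0.
    n_ab = sum(count_a * count_b / 2 for count_a, count_b in allele_counts)
    return n_ab / n - 2 * p_a * p_b
```

A sample made up only of double heterozygotes gives zero, while a sample of equal numbers of AABB and aabb individuals gives 0.5, reflecting that alleles A and B always travel together.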

The skilled person will readily appreciate that other LD calculation methods can be used without undue
experimentation.

Example 10 illustrates the measurement of LD between a publicly known biallelic marker, the "ApoE Site A",
located within the Alzheimer's related ApoE gene, and other biallelic markers randomly derived from the genomic region
containing the ApoE gene.

Example 10
Measurement of Linkage Disequilibrium
As originally reported by Strittmatter et al. and by Saunders et al. in 1993, the Apo E ε4 allele is strongly
associated with both late-onset familial and sporadic Alzheimer's disease (AD) (Saunders, A.M., Lancet 342: 710-711
(1993) and Strittmatter, W.J. et al., Proc. Natl. Acad. Sci. U.S.A. 90: 1977-1981 (1993), the disclosures of which are
incorporated herein by reference). The 3 major isoforms of human Apolipoprotein E (apoE2, -E3, and -E4), as identified by
isoelectric focusing, are coded for by 3 alleles (ε2, ε3, and ε4). The ε2, ε3, and ε4 isoforms differ in amino acid
sequence at 2 sites, residue 112 (called site A) and residue 158 (called site B). The ancestral isoform of the protein is
Apo E3, which at sites A/B contains cysteine/arginine, while ApoE2 and -E4 contain cysteine/cysteine and
arginine/arginine, respectively (Weisgraber, K.H. et al., J. Biol. Chem. 256: 9077-9083 (1981); Rall, S.C. et al., Proc.
Natl. Acad. Sci. U.S.A. 79: 4696-4700 (1982), the disclosures of which are incorporated herein by reference).

Apo E ε4 is currently considered as a major susceptibility risk factor for AD development in individuals of
different ethnic groups (especially in Caucasians and Japanese compared to Hispanics or African Americans), across all
ages between 40 and 90 years, and in both men and women, as reported recently in a study performed on 5930 AD
patients and 8607 controls (Farrer et al., JAMA 278:1349-1356 (1997), the disclosure of which is incorporated herein
by reference). More specifically, the frequency of a C base coding for arginine 112 at site A is significantly increased in
AD patients.

Although the mechanistic link between Apo E ε4 and neuronal degeneration characteristic of AD remains to be
established, current hypotheses suggest that the Apo E genotype may influence neuronal vulnerability by increasing the
deposition and/or aggregation of the amyloid beta peptide in the brain or by indirectly reducing energy availability to
neurons by promoting atherosclerosis.

Using the methods of the present invention, biallelic markers that are in the vicinity of the Apo E site A were
generated and the association of one of their alleles with Alzheimer's disease was analyzed. An Apo E public marker
(stSG94) was used to screen a human genome BAC library as previously described. A BAC, which gave a unique FISH
hybridization signal on chromosomal region 19q13.2.3, the chromosomal region harboring the Apo E gene, was selected
for finding biallelic markers in linkage disequilibrium with the Apo E gene as follows.

This BAC contained an insert of 205 kb that was subcloned as previously described. Fifty BAC subclones were
randomly selected and sequenced. Twenty five subclone sequences were selected and used to design twenty five pairs
of PCR primers allowing 500 bp amplicons to be generated. These PCR primers were then used to amplify the
corresponding genomic sequences in a pool of DNA from 100 unrelated individuals (blood donors of French origin) as
already described.

Amplification products from pooled DNA were sequenced and analyzed for the presence of biallelic
polymorphisms, as already described. Five amplicons were shown to contain a polymorphic base in the pool of 100
unrelated individuals, and therefore these polymorphisms were selected as random biallelic markers in the vicinity of the
Apo E gene. The sequences of both alleles of these biallelic markers (99-344/439; 99-355/219; 99-359/308;
99-365/344; 99-366/274) correspond to SEQ ID Nos: 301-305 and 307-311 (see the accompanying Sequence Listing and
Table 10). Corresponding pairs of amplification primers for generating amplicons containing these biallelic markers can
be chosen from those listed as SEQ ID Nos: 313-317 and 319-323.

An additional pair of primers (SEQ ID Nos: 318 and 324) was designed that allows the
genomic fragment carrying the biallelic polymorphism corresponding to the ApoE marker (99-2452/54; C/T; the C allele
is designated SEQ ID NO: 306 in the accompanying Sequence Listing, while the T allele is designated SEQ ID NO: 312 in
the accompanying Sequence Listing; see also Table 10), publicly known as Apo E site A (Weisgraber et al. (1981),
supra; Rall et al. (1982), supra), to be amplified.

The five random biallelic markers plus the Apo E site A marker were physically ordered by PCR screening of the
corresponding amplicons using all available BACs originally selected from the genomic DNA libraries, as previously
described, using the public Apo E marker stSG94. The amplicons' order derived from this BAC screening is as follows:

(99-344/99-366) - (99-365/99-2452) - 99-359 - 99-355,
where brackets indicate that the exact order of the respective amplicons could not be established.

Linkage disequilibrium among the six biallelic markers (five random markers plus the Apo E site A) was
determined by genotyping the same 100 unrelated individuals from whom the random biallelic markers were identified.

DNA samples and amplification products from genomic PCR were obtained in similar conditions as those
described above for the generation of biallelic markers, and subjected to automated microsequencing reactions using
fluorescent ddNTPs (specific fluorescence for each ddNTP) and the appropriate microsequencing primers having a 3' end
immediately upstream of the polymorphic base in the biallelic markers. The sequence of these microsequencing primers is
indicated within the corresponding sequence listings of SEQ ID Nos: 325-330. Once specifically extended at the 3' end
by a DNA polymerase using the complementary fluorescent dideoxynucleotide analog (thermal cycling), the
microsequencing primer was precipitated to remove the unincorporated fluorescent ddNTPs. The reaction products were
analyzed by electrophoresis on ABI 377 sequencing machines. Results were automatically analyzed by appropriate
software further described in Example 13.

Linkage disequilibrium (LD) between all pairs of biallelic markers (Mi, Mj) was calculated for every allele
combination (Mi1,Mj1 ; Mi1,Mj2 ; Mi2,Mj1 ; Mi2,Mj2) according to the maximum likelihood estimate (MLE) for delta (the
composite linkage disequilibrium coefficient). The results of the LD analysis between the Apo E Site A marker and the
five new biallelic markers (99-344/439 ; 99-355/219 ; 99-359/308 ; 99-365/344 ; 99-366/274) are summarized in Table
2 below:



Table 2

Markers                    d x 100    SEQ ID Nos of the     SEQ ID Nos of the
                                      biallelic markers     amplification primers

ApoE Site A 99-2452/54                306, 312              318, 324
99-344/439                 1          301, 307              313, 319
99-366/274                 1          305, 311              317, 323
99-365/344                 8          304, 310              316, 322
99-359/308                 2          303, 309              315, 321
99-355/219                 1          302, 308              314, 320


The above LD results indicate that among the five biallelic markers randomly selected in a region of about 200
kb containing the Apo E gene, marker 99-365/344T is in relatively strong linkage disequilibrium with the Apo E site A
allele (99-2452/54C).

Therefore, since the Apo E site A allele is associated with Alzheimer's disease, one can predict that the T allele
of marker 99-365/344 will probably be found associated with AD. In order to test this hypothesis, the biallelic markers
of SEQ ID Nos: 301-306 and 307-312 were used in association studies as described below.

225 Alzheimer's disease patients were recruited according to clinical inclusion criteria based on the MMSE
test. The 248 control cases included in this study were both ethnically- and age-matched to the affected cases. Both
affected and control individuals corresponded to unrelated cases. The identities of the polymorphic bases of each of the
biallelic markers were determined in each of these individuals using the methods described above. Techniques for
conducting association studies are further described below.

The results of this study are summarized in Table 3 below : 



Table 3

ASSOCIATION DATA

MARKER                      Difference in allele frequency          Corresponding
                            between individuals with Alzheimer's    p-value
                            and control individuals

99-344/439                  3.3%                                    9.54 E-02
99-366/274                  1.6%                                    2.09 E-01
99-365/344                  17.7%                                   6.9 E-10
99-2452/54 (ApoE Site A)    23.8%                                   3.95 E-21
99-359/308                  0.4%                                    9.2 E-01
99-355/219                  2.5%                                    2.54 E-01



The frequency of the Apo E site A allele in both AD cases and controls was found to be in agreement with that
previously reported (ca. 10% in controls and ca. 34% in AD cases, leading to a 24% difference in allele frequency), thus
validating the Apo E e4 association in the populations used for this study.

Moreover, as predicted from the LD analysis (Table 2), a significant association of the T allele of marker
99-365/344 with AD cases (18% increase in the T allele frequency in AD cases compared to controls, p value for this
difference = 6.9 E-10) was observed.

The above results indicate that any marker in LD with one given marker associated with a trait will be
associated with the trait. It will be appreciated that, though in this case the ApoE Site A marker is the trait-causing
allele (TCA) itself, the same conclusion could be drawn with any other non-TCA marker associated with the studied trait.
These results further indicate that conducting association studies with a set of biallelic markers randomly
generated within a candidate region at a sufficient density (here about one biallelic marker every 40kb on average)
allows the identification of at least one marker associated with the trait.

In addition, these results correlate with the physical order of the six biallelic markers contemplated within the
present example (see above) : marker 99-365/344, which had been found to be the closest in terms of physical distance
to the ApoE Site A marker, also shows the strongest LD with the Apo E site A marker.

In order to further refine the relationship between physical distance and linkage disequilibrium between biallelic
markers, a ca. 450 kb fragment from a genomic region on chromosome 8 was fully sequenced.

LD within ca. 230 pairs of biallelic markers derived therefrom was measured in a random French population
and analyzed as a function of the known physical inter-marker spacing. This analysis confirmed that on average, LD
between 2 biallelic markers correlates with the physical distance that separates them. It further indicated that LD
between 2 biallelic markers tends to decrease when their spacing increases. More particularly, LD between 2 biallelic
markers tends to decrease when their inter-marker distance is greater than 50kb, and is further decreased when the
inter-marker distance is greater than 75kb. It was further observed that when 2 biallelic markers were further than
150kb apart, most often no significant LD between them could be evidenced. It will be appreciated that the size and
history of the sample population used to measure LD between markers may influence the distance beyond which LD
tends not to be detectable.
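
The analysis of LD as a function of inter-marker spacing described above may be sketched, purely as an illustration, by averaging an LD measure over marker pairs grouped into distance classes. The class boundaries (50, 75 and 150kb) are those quoted in the text; the function name, the input format and the sample values are hypothetical.

```python
import math


def mean_ld_by_distance(pairs, bin_edges_kb=(0, 50, 75, 150, math.inf)):
    """Average an LD measure over marker pairs grouped by spacing.

    pairs: iterable of (distance_kb, ld_value) tuples (hypothetical format).
    Returns a dict mapping each (low, high) distance class to the mean LD of
    the pairs falling in it, or None when the class is empty.
    """
    classes = {(low, high): [0.0, 0]
               for low, high in zip(bin_edges_kb, bin_edges_kb[1:])}
    for distance, ld in pairs:
        for (low, high), acc in classes.items():
            if low <= distance < high:
                acc[0] += ld   # running sum of LD values in this class
                acc[1] += 1    # number of pairs in this class
                break
    return {cls: (total / count if count else None)
            for cls, (total, count) in classes.items()}


# Usage with made-up values: LD is expected to fall as spacing grows.
example = [(10, 0.8), (40, 0.6), (60, 0.4), (100, 0.2), (200, 0.0)]
for cls, mean_ld in mean_ld_by_distance(example).items():
    print(cls, mean_ld)
```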

Assuming that LD can be measured between markers spanning regions up to an average of 150kb long, biallelic
marker maps will allow genome-wide LD mapping, provided they have an average inter-marker distance lower than
150kb.

Genome-wide LD mapping aims at identifying, for any TCA being searched, at least one biallelic marker in LD
with said TCA. Preferably, in order to enhance the power of LD maps, in some embodiments, the biallelic markers therein
have average inter-marker distances of 150kb or less, 75kb or less, 50kb or less, 30kb or less, or 25kb or less to
accommodate the fact that, in some regions of the genome, the detection of LD requires lower inter-marker distances.

The present invention provides methods to generate biallelic marker maps with average inter-marker distances
of 150kb or less. In some embodiments, the mean distance between biallelic markers constituting the high density map
will be less than 75kb, preferably less than 50kb. Further preferred maps according to the present invention contain
markers that are less than 37.5kb apart. In highly preferred embodiments, the average inter-marker spacing for the
biallelic markers constituting very high density maps is less than 30kb, most preferably less than 25kb.
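
As a rough guide, the number of evenly spaced markers implied by each of the above average inter-marker distances may be estimated as sketched below. The genome size of 3,000 Mb is an assumption made for this illustration, as are the function and constant names.

```python
GENOME_SIZE_KB = 3_000_000  # assumption: haploid genome of about 3,000 Mb


def markers_needed(spacing_kb, genome_kb=GENOME_SIZE_KB):
    """Approximate number of markers for a given average spacing (in kb)."""
    if spacing_kb <= 0:
        raise ValueError("spacing must be positive")
    return round(genome_kb / spacing_kb)


# Usage: marker counts for the spacings quoted above.
for spacing in (150, 75, 50, 37.5, 30, 25):
    print(f"{spacing} kb -> about {markers_needed(spacing):,} markers")
```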

Genetic maps containing biallelic markers (including the 653 biallelic markers obtained above, which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) may be used to identify and
isolate genes associated with detectable traits. The use of the genetic maps of the present invention is described in
more detail below.

Use of the High Density Biallelic Marker Map to Identify
Genes Associated with a Detectable Trait
One embodiment of the present invention comprises methods for identifying and isolating genes associated 
with a detectable trait using the biallelic marker maps of the present invention. 

In the past, the identification of genes linked with detectable traits has relied on a statistical approach called
linkage analysis. Linkage analysis is based upon establishing a correlation between the transmission of genetic markers
and that of a specific trait throughout generations within a family. In this approach, all members of a series of affected
families are genotyped with a few hundred markers, typically microsatellite markers, which are distributed at an average
density of one every 10 Mb. By comparing genotypes in all family members, one can attribute sets of alleles to parental
haploid genomes (haplotyping or phase determination). The origin of recombined fragments is then determined in the
offspring of all families. Those that co-segregate with the trait are tracked. After pooling data from all families,
statistical methods are used to determine the likelihood that the marker and the trait are segregating independently in all
families. As a result of the statistical analysis, one or several regions having a high probability of harboring a gene linked
to the trait are selected as candidates for further analysis. The result of linkage analysis is considered as significant (i.e.
there is a high probability that the region contains a gene involved in a detectable trait) when the chance of independent
segregation of the marker and the trait is lower than 1 in 1000 (expressed as a LOD score > 3). Generally, the length
of the candidate region identified using linkage analysis is between 2 and 20Mb.
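
The relationship between the "1 in 1000" criterion and a LOD score of 3 may be illustrated with the following minimal sketch. It assumes the simplest textbook setting of phase-known, fully informative meioses, which is a simplification introduced for this example; the function name and the counts used are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score: log10 of the likelihood at recombination fraction theta
    divided by the likelihood under independent segregation (theta = 0.5).

    Assumes phase-known, fully informative meioses (a simplification).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# Usage: ten meioses without any recombinant, evaluated at theta = 0.001,
# give odds slightly above 1000 to 1 in favour of linkage.
print(lod_score(0, 10, 0.001) > 3)
```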

Once a candidate region is identified as described above, analysis of recombinant individuals using additional
markers allows further delineation of the candidate linked region.

Linkage analysis studies have generally relied on the use of a maximum of 5,000 microsatellite markers, thus 
limiting the maximum theoretical attainable resolution of linkage analysis to ca. 600 kb on average. 

Linkage analysis has been successfully applied to map simple genetic traits that show clear Mendelian
inheritance patterns and which have a high penetrance (penetrance is the ratio between the number of trait positive
carriers of allele a and the total number of a carriers in the population). About 100 pathological trait-causing genes were
discovered using linkage analysis over the last 10 years. In most of these cases, the majority of affected individuals had
affected relatives and the detectable trait was rare in the general population (frequencies less than 0.1%). In about 10
cases, such as Alzheimer's Disease, breast cancer, and Type II diabetes, the detectable trait was more common but the
allele associated with the detectable trait was rare in the affected population. Thus, the alleles associated with these
traits were not responsible for the trait in all sporadic cases.

Linkage analysis suffers from a variety of drawbacks. First, linkage analysis is limited by its reliance on the
choice of a genetic model suitable for each studied trait. Furthermore, as already mentioned, the resolution attainable
using linkage analysis is limited, and complementary studies are required to refine the analysis of the typical 2Mb to
20Mb regions initially identified through linkage analysis.

In addition, linkage analysis approaches have proven difficult when applied to complex genetic traits, such as
those due to the combined action of multiple genes and/or environmental factors. In such cases, too large an effort and
cost are needed to recruit the adequate number of affected families required for applying linkage analysis to these
situations, as recently discussed by Risch, N. and Merikangas, K. (Science 273:1516-1517 (1996), the disclosure of
which is incorporated herein by reference).

Finally, linkage analysis cannot be applied to the study of traits for which no large informative families are
available. Typically, this will be the case in any attempt to identify trait-causing alleles involved in sporadic cases, such
as alleles associated with positive or negative responses to drug treatment.

The present genetic maps and biallelic markers (including the 653 biallelic markers obtained above, which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) may be used to
identify and isolate genes associated with detectable traits using association studies, an approach which does not
require the use of affected families and which permits the identification of genes associated with sporadic traits.

Association studies are described in more detail below.

Association Studies 

As already mentioned, any gene responsible or partly responsible for a given trait will be in LD with some
flanking markers. To map such a gene, specific alleles of these flanking markers which are associated with the gene or
genes responsible for the trait are identified. Although the following discussion of techniques for finding the gene or
genes associated with a particular trait using linkage disequilibrium mapping refers to locating a single gene which is
responsible for the trait, it will be appreciated that the same techniques may also be used to identify genes which are
partially responsible for the trait.

Association studies may be conducted within the general population (as opposed to the linkage analysis
techniques discussed above which are limited to studies performed on related individuals in one or several affected
families).

Association between a biallelic marker A and a trait T may primarily occur as a result of three possible
relationships between the biallelic marker and the trait.

First, allele a of biallelic marker A may be directly responsible for trait T (e.g., Apo E e4 site A and Alzheimer's
disease). However, since the majority of the biallelic markers used in genetic mapping studies are selected randomly,
they mainly map outside of genes. Thus, the likelihood of allele a being a functional mutation directly related to trait T is
very low.

Second, an association between a biallelic marker A and a trait T may also occur when the biallelic marker is
very closely linked to the trait locus. In other words, an association occurs when allele a is in linkage disequilibrium with
the trait-causing allele. When the biallelic marker is in close proximity to a gene responsible for the trait, more extensive
genetic mapping will ultimately allow a gene to be discovered near the marker locus which carries mutations in people
with trait T (i.e. the gene responsible for the trait or one of the genes responsible for the trait). As will be further
exemplified below, using a group of biallelic markers which are in close proximity to the gene responsible for the trait, the
location of the causal gene can be deduced from the profile of the association curve between the biallelic markers and
the trait. The causal gene will usually be found in the vicinity of the marker showing the highest association with the
trait.

Finally, an association between a biallelic marker and a trait may occur when people with the trait and people
without the trait correspond to genetically different subsets of the population who, coincidentally, also differ in the
frequency of allele a (population stratification). This phenomenon may be avoided by using ethnically matched large
heterogeneous samples.

Association studies are particularly suited to the efficient identification of genes that present common
polymorphisms, and are involved in multifactorial traits whose frequency is relatively higher than that of diseases with
monofactorial inheritance.

Association studies mainly consist of four steps: recruitment of trait-positive (T+) and trait-negative (T-)
populations with well-defined phenotypes, identification of a candidate region suspected of harboring a trait-causing
gene, identification of said gene among candidate genes in the region, and finally validation of mutation(s) responsible for
the trait in said trait-causing gene.

In a first step, trait + and trait - phenotypes have to be well-defined. In order to perform efficient and
significant association studies such as those described herein, the trait under study should preferably follow a bimodal
distribution in the population under study, presenting two clear non-overlapping phenotypes, trait + and trait -.

Nevertheless, in the absence of such a bimodal distribution (as may in fact be the case for complex genetic
traits), any genetic trait may still be analyzed using the association method proposed herein by carefully selecting the
individuals to be included in the trait + and trait - phenotypic groups. The selection procedure involves selecting
individuals at opposite ends of the non-bimodal phenotype spectrum of the trait under study, so as to include in these
trait + and trait - populations individuals who clearly represent non-overlapping, preferably extreme phenotypes.

The definition of the inclusion criteria for the trait + and trait - populations is an important aspect of the
present invention. The selection of those drastically different but relatively uniform phenotypes enables efficient
comparisons in association studies and the possible detection of marked differences at the genetic level, provided that
the sample sizes of the populations under study are significant enough.

Generally, trait + and trait - populations to be included in association studies such as those proposed in the
present invention consist of phenotypically homogeneous populations of individuals each representing 100% of the
corresponding phenotype if the trait distribution is bimodal. If the trait distribution is non-bimodal, trait + and trait -
populations consist of phenotypically uniform populations of individuals representing each between 1 and 88%,
preferably between 1 and 80%, more preferably between 1 and 50%, and more preferably between 1 and 30%, most
preferably between 1 and 20% of the total population under study, and selected among individuals exhibiting
non-overlapping phenotypes. In some embodiments, the T+ and T- groups consist of individuals exhibiting the extreme
phenotypes within the studied population. The clearer the difference between the two trait phenotypes, the greater the
probability of detecting an association with biallelic markers.
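
For a trait with a non-bimodal distribution, the selection of trait + and trait - groups from opposite ends of the phenotype spectrum may be sketched as follows. The sketch assumes that the phenotype is available as a single quantitative value per individual; the function name, the identifiers and the choice of equal-sized tails are assumptions made for illustration.

```python
def select_extremes(phenotypes, fraction):
    """Pick non-overlapping groups from both tails of a quantitative trait.

    phenotypes: dict mapping an individual identifier to a trait value.
    fraction: share of the population placed in each group (at most 0.5,
    so that the two groups cannot overlap).
    Returns (trait_minus_ids, trait_plus_ids).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    if len(phenotypes) < 2:
        raise ValueError("at least two individuals are required")
    ranked = sorted(phenotypes, key=phenotypes.get)  # lowest value first
    group_size = max(1, int(len(ranked) * fraction))
    return ranked[:group_size], ranked[-group_size:]


# Usage: keep the lowest and highest 20% of a made-up population.
population = {f"id{i}": float(i) for i in range(10)}
trait_minus, trait_plus = select_extremes(population, 0.2)
print(trait_minus, trait_plus)
```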

In preferred embodiments, a first group of between 50 and 300 trait + individuals, preferably about 100
individuals, are recruited according to their phenotypes. In each case, a similar number of trait negative individuals are
included in such studies who are preferably both ethnically- and age-matched to the trait positive cases. Both trait + and
trait - individuals should correspond to unrelated cases.

Figure 3 shows, for a series of hypothetical sample sizes, the p-value significance obtained in association
studies performed using individual markers from the high-density biallelic map, according to various hypotheses regarding
the difference of allelic frequencies between the T+ and T- samples. It indicates that in all cases, samples ranging from
150 to 500 individuals are numerous enough to achieve statistical significance. It will be appreciated that bigger or
smaller groups can be used to perform association studies according to the methods of the present invention.
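
The dependence of the p-value on sample size and on the difference in allelic frequencies may be explored with a sketch of the following kind. It assumes a chi-square test with one degree of freedom on allele counts, with each individual contributing two alleles; the function names, the candidate sample sizes and the frequencies used are hypothetical and do not reproduce the data of Figure 3.

```python
import math


def allele_test_p_value(n_cases, n_controls, freq_cases, freq_controls):
    """P-value (chi-square, 1 df) for a difference in allele frequency."""
    alleles_cases = 2 * n_cases
    alleles_controls = 2 * n_controls
    pooled = ((alleles_cases * freq_cases + alleles_controls * freq_controls)
              / (alleles_cases + alleles_controls))
    if pooled <= 0 or pooled >= 1:
        return 1.0  # allele absent or fixed in both groups: nothing to test
    variance = pooled * (1 - pooled) * (1 / alleles_cases + 1 / alleles_controls)
    chi2 = (freq_cases - freq_controls) ** 2 / variance
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def smallest_group_size(freq_cases, freq_controls, p_threshold,
                        sizes=range(50, 1001, 50)):
    """Smallest tested group size (per group) reaching the p-value threshold."""
    for n in sizes:
        if allele_test_p_value(n, n, freq_cases, freq_controls) < p_threshold:
            return n
    return None


# Usage: how many individuals per group for a 10% frequency difference?
print(smallest_group_size(0.30, 0.20, 0.05))
```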

In a second step, a marker/trait association study is performed that compares the genotype frequency of each
biallelic marker in the above described T+ and T- populations by means of a chi square statistical test (one degree of
freedom). In addition to this single marker association analysis, a haplotype association analysis is performed to define
the frequency and the type of the ancestral carrier haplotype. Haplotype analysis, by combining the informativeness of a
set of biallelic markers, increases the power of the association analysis, allowing false positive and/or negative data that
may result from the single marker studies to be eliminated.
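
A single marker comparison of this kind may be sketched as follows. The sketch assumes that genotype counts are first converted to allele counts and that the resulting 2 x 2 table is tested with a Pearson chi-square statistic (one degree of freedom); the function names and the counts used are hypothetical.

```python
import math


def allele_table(cases, controls):
    """Build a 2 x 2 allele-count table from genotype counts.

    cases, controls: (n_11, n_12, n_22) genotype counts for alleles 1 and 2.
    """
    def alleles(genotype_counts):
        n_11, n_12, n_22 = genotype_counts
        return (2 * n_11 + n_12, 2 * n_22 + n_12)

    return (alleles(cases), alleles(controls))


def chi_square_1df(table):
    """Pearson chi-square statistic and p-value for a 2 x 2 table."""
    (a, b), (c, d) = table
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or a column of the table is empty")
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denominator
    # Survival function of a chi-square variable with one degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Usage with made-up genotype counts for a T+ and a T- group.
chi2, p_value = chi_square_1df(allele_table((5, 10, 0), (0, 10, 5)))
print(chi2, p_value)
```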

Genotyping can be performed using the microsequencing procedure described in Example 13, or any other 
genotyping procedure suitable for this intended purpose. 

If a positive association with a trait is identified using an array of biallelic markers having a high enough
density, the causal gene will be physically located in the vicinity of the associated markers, since the markers showing
positive association with the trait are in linkage disequilibrium with the trait locus. Regions harboring a gene responsible
for a particular trait which are identified through association studies using high density sets of biallelic markers will, on
average, be 20 - 40 times shorter in length than those identified by linkage analysis.

Once a positive association is confirmed as described above, a third step consists of completely sequencing the
BAC inserts harboring the markers identified in the association analyses. These BACs are obtained through screening
human genomic libraries with the marker probes and/or primers, as described above. Once a candidate region has been
sequenced and analyzed, the functional sequences within the candidate region (e.g. exons, splice sites, promoters, and
other potential regulatory regions) are scanned for mutations which are responsible for the trait by comparing the
sequences of the functional regions in a selected number of T+ and T- individuals using appropriate software. Tools for
sequence analysis are further described in Example 14.

Finally, candidate mutations are then validated by screening a larger population of T+ and
T- individuals using genotyping techniques described below. Polymorphisms are confirmed as
candidate mutations when the validation population shows association results compatible with those
found between the mutation and the trait in the test population.

In practice, in order to define a region bearing a candidate gene, the trait + and trait - populations are
genotyped using an appropriate number of biallelic markers. The markers may include one or more of the 653 markers
obtained above (which include the sequences of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto).

The markers used to define a region bearing a candidate gene may be distributed at an average density of 1
marker per 10-200 kb. Preferably, the markers used to define a region bearing a candidate gene are distributed at an
average density of 1 marker every 15-150 kb. In further preferred embodiments, the markers used to define a region
bearing a candidate gene are distributed at an average density of 1 marker every 20-100kb. In yet another preferred
embodiment, the markers used to define a region bearing a candidate gene are distributed at an average density of 1
marker every 100 to 150kb. In a further highly preferred embodiment, the markers used to define a region bearing a
candidate gene are distributed at an average density of 1 marker every 50 to 100kb. In yet another embodiment, the
biallelic markers used to define a region bearing a candidate gene are distributed at an average density of 1 marker every
25-50 kilobases. As mentioned above, in order to enhance the power of linkage disequilibrium based maps, in a preferred
embodiment, the marker density of the map will be adapted to take the linkage disequilibrium distribution in the genomic
region of interest into account.

In some embodiments, the initial identification of a candidate genomic region harboring a gene associated with
a detectable phenotype may be conducted using a preliminary map containing a few thousand biallelic markers.
Thereafter, the genomic region harboring the gene responsible for the detectable trait may be better delineated using a
map containing a larger number of biallelic markers. Furthermore, the genomic region harboring the gene responsible for
the detectable trait may be further delineated using a high density map of biallelic markers. Finally, the gene associated
with the detectable trait may be identified and isolated using a very high density biallelic marker map.

Example 11 describes a hypothetical procedure for identifying a candidate region harboring a gene associated
with a detectable trait. It will be appreciated that although Example 11 compares the results of analyses using markers
derived from maps having 3,000, 20,000, and 80,000 markers, the number of markers contained in the map is not
restricted to these exemplary figures. Rather, Example 11 exemplifies the increasing refinement of the candidate region
with increasing marker density. As increasing numbers of markers are used in the analysis, points in the association
analysis become broad peaks. The gene associated with the detectable trait under investigation will lie within or near
the region under the peak.

Example 11

Identification of a Candidate Region Harboring a
Gene Associated with a Detectable Trait

The initial identification of a candidate genomic region harboring a gene associated with a detectable trait may
be conducted using a genome-wide map comprising about 20,000 biallelic markers. The candidate genomic region may
be further defined using a map having a higher marker density, such as a map comprising about 40,000 markers, about
60,000 markers, about 80,000 markers, about 100,000 markers, or about 120,000 markers.

The use of high density maps such as those described above allows the identification of genes which are truly
associated with detectable traits, since the coincidental associations will be randomly distributed along the genome
while the true associations will map within one or more discrete genomic regions. Accordingly, biallelic markers located
in the vicinity of a gene associated with a detectable trait will give rise to broad peaks in graphs plotting the frequencies
of the biallelic markers in T+ individuals versus T- individuals. In contrast, biallelic markers which are not in the vicinity
of the gene associated with the detectable trait will produce unique points in such a plot. By determining the
association of several markers within the region containing the gene associated with the detectable trait, the gene
associated with the detectable trait can be identified using an association curve which reflects the difference between
the allele frequencies within the T+ and T- populations for each studied marker. The gene associated with the
detectable trait will be found in the vicinity of the marker showing the highest association with the trait.

Figures 4, 5, and 6 illustrate the above principles. As illustrated in Figure 4, an association analysis conducted
with a map comprising about 3,000 biallelic markers yields a group of points. However, when an association analysis is
performed using a denser map which includes additional biallelic markers, the points become broad peaks indicative of
the location of a gene associated with a detectable trait. For example, the biallelic markers used in the initial association
analysis may be obtained from a map comprising about 20,000 biallelic markers, as illustrated in Figure 5. In some
embodiments, one or more of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto) are used in the association analysis.

In the hypothetical example of Figure 4, the association analysis with 3,000 markers suggests peaks near 
markers 9 and 17. 

Next, a second analysis is performed using additional markers in the vicinity of markers 9 and 17, as illustrated
in the hypothetical example of Figure 5, using a map of about 20,000 markers. This step again indicates an association
in the close vicinity of marker 17, since more markers in this region show an association with the trait. However, none
of the additional markers around marker 9 shows a significant association with the trait, which makes marker 9 a
potential false positive. In some embodiments, one or more of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) are used in the second
analysis. In order to further test the validity of these two suspected associations, a third analysis may be obtained with
a map comprising about 60,000 biallelic markers. In some embodiments, one or more of the 653 biallelic markers
obtained above are used in the third association analysis. In the hypothetical example of Figure 6, more markers lying
around marker 17 exhibit a high degree of association with the detectable trait. Conversely, no association is confirmed
in the vicinity of marker 9. The genomic region surrounding marker 17 can thus be considered a candidate region for the
hypothetical trait of this simulation.
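
The distinction drawn above between a broad peak (marker 17) and an isolated point (marker 9) may be sketched as a simple neighbourhood check: a significant marker is retained only if other significant markers lie close to it. The function name, the window size, the threshold and the data below are hypothetical and serve only to illustrate the principle.

```python
def supported_hits(markers, threshold, window_kb, min_neighbors=1):
    """Return positions of significant markers supported by their neighbours.

    markers: list of (position_kb, p_value) tuples (hypothetical format).
    A marker is kept when its p-value is below the threshold and at least
    min_neighbors other significant markers lie within window_kb of it.
    """
    significant = [pos for pos, p in markers if p < threshold]
    hits = []
    for i, pos in enumerate(significant):
        neighbors = sum(
            1 for j, other in enumerate(significant)
            if j != i and abs(other - pos) <= window_kb
        )
        if neighbors >= min_neighbors:
            hits.append(pos)
    return hits


# Usage: the marker at 100 kb is significant but isolated, whereas the
# markers around 900-980 kb support one another.
data = [(100, 1e-5), (150, 0.4), (900, 1e-6), (940, 1e-4), (980, 1e-5)]
print(supported_hits(data, threshold=1e-3, window_kb=100))
```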

The statistical power of LD mapping using a high density marker map is also reinforced by complementing the
single point association analysis described above with a multi-marker association analysis, called haplotype analysis.

When a chromosome carrying a disease allele is first introduced into a population as a result of either mutation
or migration, the mutant allele necessarily resides on a chromosome having a unique set of linked markers: the ancestral
haplotype. As already mentioned, a haplotype association analysis allows the frequency and the type of the ancestral
carrier haplotype to be defined.

A haplotype analysis is performed by estimating the frequencies of all possible haplotypes for a given set of
biallelic markers in the T+ and T- populations, and comparing these frequencies by means of a chi square statistical test
(one degree of freedom). Haplotype estimations are usually performed by applying the Expectation-Maximization (EM)
algorithm (Excoffier L and Slatkin M, Mol. Biol. Evol. 12:921-927 (1995), the disclosure of which is incorporated herein
by reference), using the EM-HAPLO program (Hawley ME, Pakstis AJ & Kidd KK, Am. J. Phys. Anthropol. 18:104
(1994), the disclosure of which is incorporated herein by reference). The EM algorithm is used to estimate haplotype
frequencies in the case when only genotype data from unrelated individuals are available. The EM algorithm is a
generalized iterative maximum likelihood approach to estimation that is useful when data are ambiguous and/or
incomplete.
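
The principle of the EM estimation may be illustrated, for the simplest case of two biallelic markers, by the sketch below. It is not the EM-HAPLO program and makes no claim to reproduce it; it assumes unrelated individuals and random mating, and the function name and input layout are hypothetical. Only double heterozygotes are ambiguous as to phase, so the expectation step splits them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
def em_haplotype_frequencies(counts, max_iterations=100, tolerance=1e-10):
    """Estimate two-marker haplotype frequencies from unphased genotypes.

    counts[i][j]: number of individuals carrying i copies of allele A at the
    first marker and j copies of allele B at the second marker (i, j in 0..2).
    Returns a dict with the frequencies of haplotypes AB, Ab, aB and ab.
    """
    n = sum(sum(row) for row in counts)
    if n == 0:
        raise ValueError("no genotyped individual")
    # Haplotypes that can be counted directly (phase is unambiguous).
    known = {
        "AB": 2 * counts[2][2] + counts[2][1] + counts[1][2],
        "Ab": 2 * counts[2][0] + counts[2][1] + counts[1][0],
        "aB": 2 * counts[0][2] + counts[1][2] + counts[0][1],
        "ab": 2 * counts[0][0] + counts[1][0] + counts[0][1],
    }
    double_hets = counts[1][1]  # phase is either AB/ab or Ab/aB
    freq = {h: 0.25 for h in known}
    for _ in range(max_iterations):
        # E-step: expected share of double heterozygotes that are AB/ab.
        cis = freq["AB"] * freq["ab"]
        trans = freq["Ab"] * freq["aB"]
        share = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate the frequencies from the completed counts.
        new = {
            "AB": (known["AB"] + double_hets * share) / (2 * n),
            "ab": (known["ab"] + double_hets * share) / (2 * n),
            "Ab": (known["Ab"] + double_hets * (1 - share)) / (2 * n),
            "aB": (known["aB"] + double_hets * (1 - share)) / (2 * n),
        }
        converged = max(abs(new[h] - freq[h]) for h in new) < tolerance
        freq = new
        if converged:
            break
    return freq


# Usage with made-up genotype counts (rows: copies of A, columns: copies of B).
print(em_haplotype_frequencies([[25, 0, 0], [0, 50, 0], [0, 0, 25]]))
```

The frequencies obtained separately in the T+ and T- populations can then be compared as described above.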

To improve the statistical power of the individual marker association analyses conducted as described above
using maps of increasing marker densities, haplotype studies can be performed using groups of markers located in
proximity to one another within regions of the genome. For example, using the methods described above in which the
association of an individual marker with a detectable phenotype was analyzed using maps of 3,000 markers, 20,000
markers, and 60,000 markers, a series of haplotype studies can be performed using groups of contiguous markers from
such maps or from maps having higher marker densities.

In a preferred embodiment, a series of successive haplotype studies including groups of markers spanning
regions of more than 1 Mb may be performed. In some embodiments, the biallelic markers included in each of these
groups may be located within a genomic region spanning less than 1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb,
from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, from 500kb to 1Mb, or more than 1Mb.
Preferably, the genomic regions containing the groups of biallelic markers used in the successive haplotype analyses are
overlapping. It will be appreciated that the groups of biallelic markers need not completely cover the genomic regions of the
above-specified lengths but may instead be obtained from incomplete contigs having one or more gaps therein. As discussed
in further detail below, biallelic markers may be used in single point and haplotype association analyses regardless of the
completeness of the corresponding physical contig harboring them.

Without wishing to be limited to any particular numerical value, it is believed that those haplotypes displaying a
coefficient of relative risk above 1, preferably about 5 or more, preferably of about 7 or more, are indicative of a
"significant risk" for the individuals carrying the identified haplotype to develop the given trait. However, it is difficult to
evaluate accurately quantified boundaries for the so-called "significant risk". Indeed, and as it has been demonstrated
previously, several traits observed in a given population are multifactorial in that they are not only the result of a single
genetic predisposition but also of other factors such as environmental factors. Thus, the evaluation of a significant risk
must take these parameters into consideration in order to, in a certain manner, weigh the potential importance of
external parameters in the development of a given trait. Thus, the relative risk which constitutes a "significant risk" to
develop a given trait is evaluated differently depending on the trait under consideration and the populations tested.
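
In a case/control setting, a coefficient of this kind is commonly approximated by an odds ratio computed from the numbers of carriers and non-carriers of the haplotype in each group. The following sketch rests on that assumption, which is introduced here for illustration and is not stated in the present disclosure; the function name and the counts are hypothetical.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds ratio for carrying a haplotype in T+ versus T- individuals.

    Used here as an approximation of the relative risk (an assumption that
    is reasonable mainly when the trait is rare in the population).
    """
    if case_noncarriers == 0 or control_carriers == 0:
        raise ValueError("odds ratio is undefined when a denominator is zero")
    return ((case_carriers * control_noncarriers)
            / (case_noncarriers * control_carriers))


# Usage with made-up counts: 60 of 100 T+ and 30 of 100 T- carry the haplotype.
print(odds_ratio(60, 40, 30, 70))
```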

Genome wide mapping using association studies with dense enough arrays of markers permits a case-by-case
best estimate of p-value significance thresholds. Given a test population comprising two ethnically matched trait
positive and trait negative groups of about 50 to about 500 individuals or more, conducting the above described
association studies will allow a p-value "cut off" to be established by, for example, analyzing significant numbers of
allele frequency differences or, in some cases where appropriate, running computer simulations or control studies as
described in Examples 11, 20, and 31.
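
One possible form of such a computer simulation is a label permutation: the T+ and T- labels are shuffled many times, the smallest single marker p-value is recorded each time, and a low quantile of these minima is taken as the cut off. The sketch below illustrates this general idea only; it is not the procedure of the cited Examples, and the function names, the input layout and the default settings are assumptions.

```python
import math
import random


def allele_p_value(case_count, n_cases, control_count, n_controls):
    """Chi-square (1 df) p-value comparing allele counts between two groups.

    case_count, control_count: copies of the tested allele in each group;
    each individual carries two alleles.
    """
    a, b = case_count, 2 * n_cases - case_count
    c, d = control_count, 2 * n_controls - control_count
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 1.0  # allele absent or fixed: no test possible
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denominator
    return math.erfc(math.sqrt(chi2 / 2))


def permutation_cutoff(genotypes, is_case, alpha=0.05, n_perm=200, seed=0):
    """Empirical p-value cut off from shuffled trait labels.

    genotypes: one list per individual giving, for each marker, the number
    of copies (0, 1 or 2) of the tested allele (hypothetical layout).
    is_case: one boolean per individual (True for T+).
    """
    rng = random.Random(seed)
    labels = list(is_case)
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    n_markers = len(genotypes[0])
    totals = [sum(g[m] for g in genotypes) for m in range(n_markers)]
    minima = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any true marker/trait relationship
        best = 1.0
        for m in range(n_markers):
            case_count = sum(g[m] for g, lab in zip(genotypes, labels) if lab)
            p = allele_p_value(case_count, n_cases,
                               totals[m] - case_count, n_controls)
            best = min(best, p)
        minima.append(best)
    minima.sort()
    return minima[max(0, int(alpha * n_perm) - 1)]
```

Because the smallest p-value over all markers is used, the resulting cut off accounts for the number of markers tested in the studied population.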

For a p-value above the threshold, a corresponding association between the trait and a studied marker will be
deemed not significant, while for a p-value below such a threshold, said association will be deemed significant. If the
p-value is significant, the genomic region around the marker will be further scrutinized for a trait-causing gene.

It is preferred that p-value significance thresholds be assessed for each case/control population comparison.
Both the genetic distance between sampled populations ("stratification") and the dispersion due to random selection of
samples may indeed influence the p-value significance thresholds.

It will be appreciated that the above approaches may be conducted on any scale (i.e. over the whole genome, a
set of chromosomes, a single chromosome, a particular subchromosomal region, or any other desired portion of the
genome). As mentioned above, once significance thresholds have been assessed, population sample sizes may be
adapted as exemplified in Figure 3.

Example 12 below illustrates the increase in statistical power brought to an association study by a haplotype
analysis.

Example 12

Haplotype Analysis: Identification of a genomic region associated with Alzheimer's Disease (AD)

As shown in Table 3 within Example 10, at an average map density of one marker per 40 kb only one marker
(99-365/344) out of five random biallelic markers from a ca. 200 kb genomic region around the Apo E gene showed a
clear association to AD (delta allelic frequency in cases and controls = 18%; p value = 6.9 E-10). The allelic frequencies
of the other four random markers were not significantly different between AD cases and controls (p-values ≥ E-01).
However, since linkage disequilibrium can usually be detected between markers located further apart than an average 40
kb as previously discussed, one should expect that performing an association study with a local excerpt of a biallelic
marker map covering ca. 200 kb with an average inter-marker distance of ca. 40 kb should allow the identification of
more than one biallelic marker associated with AD.

A haplotype analysis was thus performed using the biallelic markers 99-344/439; 99-355/219; 99-359/308;
99-365/344; and 99-366/274 (of SEQ ID Nos. 301-305 and 307-311).

In a first step, marker 99-365/344 that was already found associated with AD was not included in the
haplotype study. Only biallelic markers 99-344/439; 99-355/219; 99-359/308; and 99-366/274, which did not show
any significant association with AD when taken individually, were used. This first haplotype analysis measured
frequencies of all possible two-, three-, or four-marker haplotypes in the AD case and control populations. As shown in
Figure 7, there was one haplotype among all the potential different haplotypes based on the four individually
non-significant markers ("haplotype 8", TAGG comprising SEQ ID No. 305 which is the T allele of marker 99-366/274, SEQ
ID No. 301 which is the A allele of marker 99-344/439, SEQ ID No. 303 which is the G allele of marker 99-359/308 and
SEQ ID No. 302 which is the G allele of marker 99-355/219), that was present at statistically significant different
frequencies in the AD case and control populations (Δ = 12%; p value = 2.05 E-06). Moreover, a significant difference
was already observed for a three-marker haplotype included in the above mentioned "haplotype 8" ("haplotype 7", TGG,
Δ = 10%; p value = 4.76 E-05). Haplotype 7 comprises SEQ ID No. 305 which is the T allele of marker 99-366/274,
SEQ ID No. 303 which is the G allele of marker 99-359/308 and SEQ ID No. 302 which is the G allele of marker
99-355/219. The haplotype association analysis thus clearly increased the statistical power of the individual marker
association studies by more than four orders of magnitude when compared to single-marker analysis (from p values
≥ E-01 for the individual markers - see Table 3 - to p value ≤ 2 E-06 for the four-marker "haplotype 8").

The significance of the values obtained for this haplotype association analysis was evaluated by the following
computer simulation. The genotype data from the AD cases and the unaffected controls were pooled and randomly
allocated to two groups which contained the same number of individuals as the case/control groups used to produce the
data summarized in Figure 7. A four-marker haplotype analysis (99-344/439; 99-355/219; 99-359/308; and
99-366/274) was run on these artificial groups. This experiment was reiterated 100 times and the results are shown in
Figure 8. No haplotype among those generated was found for which the p-value of the frequency difference between
both populations was more significant than 1 E-05. In addition, only 4% of the generated haplotypes showed p-values
lower than 1 E-04. Since both these p-value thresholds are less significant than the 2 E-06 p-value shown by
"haplotype 8", this haplotype can be considered significantly associated with AD.
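
The pooling and random reallocation described above can be sketched as a permutation test. The Python fragment below is an illustration under stated assumptions: each individual is represented as a pair of already phased haplotype strings, and the statistic compared across iterations is the absolute frequency difference of one haplotype (the simulation in the text compares p-values of frequency differences instead). The function names and the toy data are hypothetical.

```python
import random


def haplotype_frequency(group, haplotype):
    """Fraction of chromosomes in `group` that carry `haplotype`."""
    chromosomes = [h for individual in group for h in individual]
    return chromosomes.count(haplotype) / len(chromosomes)


def permutation_p_value(cases, controls, haplotype, iterations=100, seed=0):
    """Empirical p-value obtained by pooling individuals and reallocating them at random.

    The artificial groups keep the sizes of the original case and control groups.
    """
    rng = random.Random(seed)
    observed = abs(
        haplotype_frequency(cases, haplotype) - haplotype_frequency(controls, haplotype)
    )
    pooled = list(cases) + list(controls)
    at_least_as_extreme = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        artificial_cases = pooled[: len(cases)]
        artificial_controls = pooled[len(cases):]
        difference = abs(
            haplotype_frequency(artificial_cases, haplotype)
            - haplotype_frequency(artificial_controls, haplotype)
        )
        if difference >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / iterations


# Toy data: the haplotype is carried by every case chromosome and by no control chromosome.
cases = [("TAGG", "TAGG")] * 20
controls = [("CAGG", "CAGG")] * 20
print(permutation_p_value(cases, controls, "TAGG"))
```

With only 100 iterations, as in the text, the smallest non-zero empirical p-value that can be resolved is 0.01, which is why the simulation is used here to calibrate a threshold rather than to reproduce the nominal p-value.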

In a second step, marker 99-365/344 was included in the haplotype analyses. The frequency differences
between the affected and non-affected populations were calculated for all two-, three-, four- or five-marker haplotypes
involving markers 99-344/439; 99-355/219; 99-359/308; 99-366/274; and 99-365/344. The most significant
p-values obtained in each category of haplotype (involving two, three, four or five markers) were examined depending on
which markers were involved or not within the haplotype. This showed that all haplotypes which included marker
99-365/344 showed a significant association with AD (p-values in the range of E-04 to E-11).

An additional way of evaluating the significance of the values obtained in the haplotype association analysis
was to perform a similar AD case-control study on biallelic markers generated from BACs containing inserts
corresponding to genomic regions derived from chromosomes 13 or 21 and not known to be involved in Alzheimer's
disease. Performing similar haplotype and individual association analyses as those described above and in Example 10
did not generate any significant association results (all p-values for haplotype analyses were less significant than E-03;
all p-values for single marker association studies were less significant than E-02).

The results described in Examples 10 and 12, generated from individual and haplotype studies using a biallelic
marker set of an average density equal to ca. 40 kb in the region of an Alzheimer's disease trait causing gene, indicate
that all biallelic markers of sufficient informative content located within a ca. 200 kb genomic region around a TCA can
potentially be successfully used to localize a trait causing gene with the methods provided by the present invention. This
conclusion is further supported by the results obtained through measuring the linkage disequilibrium between markers
99-365/344 or 99-359/308 and ApoE 4 Site A marker within Alzheimer's patients: as one could predict since LD is the
supporting basis for association studies, LD between these pairs of markers was enhanced in the diseased population vs.
the control population, in a similar way as the haplotype analysis enhanced the significance of the corresponding
association studies.

Once a given polymorphic site has been found and characterized as a biallelic marker according to the methods 
of the present invention, several methods can be used in order to determine the specific allele carried by an individual at 
the given polymorphic base. 

In some embodiments, genotyping will be applied to one or more of the markers of SEQ ID Nos: 301-305 and
307-311 or the sequences complementary thereto. In additional embodiments, genotyping will be applied to the markers
of SEQ ID Nos. 306 and 312 as well as one or more of the markers of SEQ ID Nos. 301-305 and 307-311. In some
embodiments, genotyping will be applied to one or more of the 653 biallelic markers obtained above (which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto). The present invention further
contemplates the genotyping of any biallelic marker within the provided maps, including those that are in linkage
disequilibrium with the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto) or the markers of SEQ ID Nos. 301-312 or the sequences complementary
thereto.

Most genotyping methods require the previous amplification of a DNA region carrying the polymorphic site of
interest.

The identification of biallelic markers described previously allows the design of appropriate oligonucleotides,
which can be used as primers to amplify a DNA fragment containing the polymorphic site of interest and for the
detection of such polymorphisms.

In particularly preferred embodiments, pairs of primers of SEQ ID Nos: 313-318 and 319-324 may be used to
generate amplicons harboring the markers of SEQ ID Nos: 301-306/307-312 or the sequences complementary thereto. In
further embodiments, pairs of amplification primers may be used to generate amplicons harboring the 653 markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).
In highly preferred embodiments, pairs of the amplification primers of SEQ ID Nos. 101-150 and 151-200 may be used
to generate amplicons harboring the markers of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto.

It will be appreciated that amplification primers may be designed having any length suitable for their intended
purpose, in particular any length allowing their hybridization with a region of the DNA fragment to be amplified.

It will be further appreciated that the hybridization site of said amplification primers may be located at any
distance from the polymorphic base to be genotyped, provided said amplification primers allow the proper amplification
of a DNA fragment carrying said polymorphic site. The amplification primers may be oligonucleotides of 10, 15, 20 or
more bases in length which enable the amplification of the polymorphic site in the markers. In some embodiments, the
amplification product produced using these primers may be at least 100 bases in length (i.e. on average 50 nucleotides
on each side of the polymorphic base). In other embodiments, the amplification product produced using these primers
may be at least 500 bases in length (i.e. on average 250 nucleotides on each side of the polymorphic base). In still
further embodiments, the amplification product produced using these primers may be at least 1000 bases in length (i.e.
on average 500 nucleotides on each side of the polymorphic base).

The amplification of polymorphic fragments can be carried out as described in Example 6 on DNA samples
extracted as described in Example 5.

As already mentioned, allele frequencies of biallelic markers tested in association studies (individual or
haplotype) may be determined using microsequencing procedures.

A first step in microsequencing procedures consists in designing microsequencing primers adapted to each
biallelic marker to be genotyped. Microsequencing primers hybridize upstream of the polymorphic base to be genotyped,
either with the coding or with the non-coding strand. Microsequencing primers may be oligonucleotides of 8, 10, 15, 20
or more bases in length. Preferably, the 3' end of the microsequencing primer is immediately upstream of the
polymorphic base of the biallelic marker being genotyped, such that upon extension of the primer, the polymorphic base
is the first base incorporated. Such microsequencing primers are included within the scope of the present invention.
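
The primer placement rule above can be illustrated with a short Python sketch. It assumes the marker is given as a 5' to 3' sequence string on one strand together with the 0-based index of the polymorphic base; the function names, the default length of 20 bases and the example sequence are hypothetical and are not taken from the sequence listing.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def forward_microsequencing_primer(sequence, polymorphic_index, length=20):
    """Primer read on the given strand; its 3' end is the base just before the polymorphic base."""
    if polymorphic_index < length:
        raise ValueError("not enough upstream sequence for the requested primer length")
    return sequence[polymorphic_index - length:polymorphic_index]


def reverse_microsequencing_primer(sequence, polymorphic_index, length=20):
    """Primer read on the opposite strand; its 3' end is also adjacent to the polymorphic base."""
    start = polymorphic_index + 1
    if start + length > len(sequence):
        raise ValueError("not enough downstream sequence for the requested primer length")
    return reverse_complement(sequence[start:start + length])


# Hypothetical marker fragment with the polymorphic base (G) at index 10.
fragment = "ACGTACGTACGTTGCA"
print(forward_microsequencing_primer(fragment, 10, length=8))  # GTACGTAC
print(reverse_microsequencing_primer(fragment, 10, length=4))  # GCAA
```

With the forward primer the first base incorporated is the base of the given strand at the polymorphic position; with the reverse primer it is the complement of that base.
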

In preferred embodiments, the microsequencing primers are those indicated as features within the sequence
listings corresponding to markers of SEQ ID Nos: 325-330/331-336. In some embodiments, the 653 biallelic markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto)
are genotyped using appropriate microsequencing oligonucleotides such as those of SEQ ID Nos. 201-250 or 251-300.

It will be appreciated that the biallelic markers of the present invention may be genotyped using
microsequencing primers having any desirable length, and hybridizing to any of the strands of the marker to be tested,
provided their design is suitable for their intended purpose. In some embodiments, the amplification primers or
microsequencing primers may be labeled. For example, in some embodiments, the amplification primers or
microsequencing primers may be biotinylated.

Typical microsequencing procedures that can be used in the context of the present invention are described in 
Example 13 below. 

Example 13 

Genotyping of biallelic markers using microsequencing procedures
Several microsequencing protocols conducted in liquid phase are well known to those skilled in the art. A first
possible detection analysis allowing the allele characterization of the microsequencing reaction products relies on
detecting fluorescent ddNTP-extended microsequencing primers after gel electrophoresis. A first alternative to this
approach consists in performing a liquid phase microsequencing reaction, the analysis of which may be carried out in
solid phase.

For example, the microsequencing reaction may be performed using 5'-biotinylated oligonucleotide primers and
fluorescein-dideoxynucleotides. The biotinylated oligonucleotide is annealed to the target nucleic acid sequence
immediately adjacent to the polymorphic nucleotide position of interest. It is then specifically extended at its 3'-end
following a PCR cycle, wherein the labeled dideoxynucleotide analog complementary to the polymorphic base is
incorporated. The biotinylated primer is then captured on a microtiter plate coated with streptavidin. The analysis is
thus entirely carried out in a microtiter plate format. The incorporated ddNTP is detected by a fluorescein antibody -
alkaline phosphatase conjugate.

In practice this microsequencing analysis is performed as follows. 20 µl of the microsequencing reaction is
added to 80 µl of capture buffer (SSC 2X, 2.5% PEG 8000, 0.25 M Tris pH 7.5, 1.8% BSA, 0.05% Tween 20) and
incubated for 20 minutes on a microtiter plate coated with streptavidin (Boehringer). The plate is rinsed once with
washing buffer (0.1 M Tris pH 7.5, 0.1 M NaCl, 0.1% Tween 20). 100 µl of anti-fluorescein antibody conjugated with
alkaline phosphatase, diluted 1/5000 in washing buffer containing 1.8% BSA, is added to the microtiter plate. The
antibody is incubated on the microtiter plate for 20 minutes. After washing the microtiter plate four times, 100 µl of
4-methylumbelliferyl phosphate (Sigma) diluted to 0.4 mg/ml in 0.1 M diethanolamine pH 9.6, 10 mM MgCl2 are added. The
detection of the microsequencing reaction is carried out on a fluorimeter (Dynatech) after 20 minutes of incubation.

As another alternative, solid phase microsequencing reactions have been developed, for which either the
oligonucleotide microsequencing primers or the PCR-amplified products derived from the DNA fragment of interest are
immobilized. For example, immobilization can be carried out via an interaction between biotinylated DNA and
streptavidin-coated microtitration wells or avidin-coated polystyrene particles.

As a further alternative, the PCR reaction generating the amplicons to be genotyped can be performed directly
in solid phase conditions, following procedures such as those described in WO 96/13609, the disclosure of which is
incorporated herein by reference.

In such solid phase microsequencing reactions, incorporated ddNTPs can either be radiolabeled (see Syvanen,
Clin. Chim. Acta 226:225-236 (1994), the disclosure of which is incorporated herein by reference) or linked to
fluorescein (see Livak and Hainer, Hum. Mutat. 3:379-385 (1994), the disclosure of which is incorporated herein by
reference). The detection of radiolabeled ddNTPs can be achieved through scintillation-based techniques. The detection
of fluorescein-linked ddNTPs can be based on the binding of antifluorescein antibody conjugated with alkaline
phosphatase, followed by incubation with a chromogenic substrate (such as p-nitrophenyl phosphate).
Other possible reporter-detection couples for use in the above microsequencing procedures include:
ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase conjugate (see Harju et al., Clin.
Chem. 39(11 Pt 1):2282-2287 (1993), incorporated herein by reference)

biotinylated ddNTP and horseradish peroxidase-conjugated streptavidin with o-phenylenediamine as a substrate (see
WO 92/15712, incorporated herein by reference).

A diagnosis kit based on fluorescein-linked ddNTP with antifluorescein antibody conjugated with alkaline
phosphatase has been commercialized under the name PRONTO by GamidaGen Ltd.

As yet another alternative microsequencing procedure, Nyren et al. (Anal. Biochem. 208:171-175 (1993), the
disclosure of which is incorporated herein by reference) have described a solid-phase DNA sequencing procedure that
relies on the detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate detection
assay (ELIDA). In this procedure, the PCR-amplified products are biotinylated and immobilized on beads. The
microsequencing primer is annealed and four aliquots of this mixture are separately incubated with DNA polymerase and
one of the four different ddNTPs. After the reaction, the resulting fragments are washed and used as substrates in a
primer extension reaction with all four dNTPs present. The progress of the DNA-directed polymerization reactions is
monitored with the ELIDA. Incorporation of a ddNTP in the first reaction prevents the formation of pyrophosphate during
the subsequent dNTP reaction. In contrast, no ddNTP incorporation in the first reaction gives extensive pyrophosphate
release during the dNTP reaction and this leads to generation of light throughout the ELIDA reactions. From the ELIDA
results, the identity of the first base after the primer is easily deduced.
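
The deduction step can be sketched as follows, as an illustration only. The fragment assumes that the light signal recorded for each of the four aliquots has been reduced to a single number and that one threshold separates terminated aliquots (little light) from extended ones; the function name, the threshold and the readings are hypothetical.

```python
def bases_terminated(light_by_ddntp, threshold):
    """Return the ddNTP bases whose aliquot gave little light in the subsequent dNTP reaction.

    A ddNTP that was incorporated in the first reaction blocks extension, so almost no
    pyrophosphate is released and the signal stays below the threshold.
    """
    return sorted(base for base, light in light_by_ddntp.items() if light < threshold)


# Hypothetical luminometer readings (arbitrary units) for the four aliquots.
readings = {"A": 980.0, "C": 40.0, "G": 1010.0, "T": 995.0}
print(bases_terminated(readings, threshold=500.0))  # ['C']
```

A single returned base identifies the first base after the primer; an empty list or several bases would call for inspection of the raw signals rather than an automatic call.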

It will be appreciated that several parameters of the above-described microsequencing procedures may be 
successfully modified by those skilled in the art without undue experimentation. In particular, high throughput 
improvements to these procedures may be elaborated, following principles such as those described further below. 

It will be further appreciated that any other genotyping procedure may be applied to the genotyping of biallelic
markers.

Once the candidate region has been delineated using the high density biallelic marker map, a sequence analysis
process will allow the detection of all genes located within said region, together with a potential functional
characterization of said genes. The identified functional features may allow preferred trait-causing candidates to be
chosen from among the identified genes. More biallelic markers may then be generated within said candidate genes, and
used to perform refined association studies that will support the identification of the trait causing gene. Sequence
analysis processes are described in Example 14 below.

Example 14: Sequence Analysis 
DNA sequences, such as BAC inserts, containing the region carrying the candidate gene associated with the
detectable trait are sequenced and their sequence is analyzed using automated software which eliminates repeat
sequences while retaining potential gene sequences. The potential gene sequences are compared to numerous databases
to identify potential exons using a set of scoring algorithms such as trained Hidden Markov Models, statistical analysis
models (including promoter prediction tools) and the GRAIL neural network. Preferred databases for use in this analysis,
the construction and use of which are further detailed in Example 22 below, include the following:

NetGene database: 

This proprietary database contains sequences of 5' cDNA tags, obtained from a number of tissues and cells.
Currently more than 50,000 different 5' clones representing more than 50,000 different genes are included in NetGene.
The sequences in the NetGene database correspond specifically to the 5' regions of transcripts (first exons) and
therefore allow mapping of the beginning of genes within raw genomic sequences.

NRPU (Non-Redundant Protein-Unique) database : 

NRPU is a non-redundant merge of the publicly available NBRF/PIR, Genpept, and SwissProt databases.
Homologies found with NRPU allow the identification of regions potentially coding for already known proteins or related 
to known proteins (translated exons). 

NREST (Non-Redundant EST database):

NREST is a merge of the EST subsection of the publicly available GenBank database. Homologies found with
NREST allow the location of potentially transcribed regions (translated or non-translated exons).

NRN (Non-Redundant Nucleic acid database): 
NRN is a merge of GenBank, EMBL and their daily updates.

Any sequence giving a positive hit with NRPU, NREST or an "excellent" score using GRAIL and/or other scoring
algorithms is considered a potential functional region, and is then considered a candidate for genomic analysis.

While this first screening allows the detection of the "strongest" exons, a semi-automatic scan is further
applied to the remaining sequences in the context of the sequence assembly. That is, the sequences neighboring a 5'
site or an exon are submitted to another round of bioinformatics analysis with modified parameters. In this way, new
exon candidates are generated for genomic analysis.

Using the above procedures, genes associated with detectable traits may be identified.

Examples 15-23 illustrate the application of the above methods using biallelic markers to identify a gene
associated with a complex disease, prostate cancer, within a ca. 450 kb candidate region. Additional details of the
identification of the gene associated with prostate cancer are provided in the U.S. Patent Application entitled "Prostate
Cancer Gene" (GENSET.018A, Serial No. 08/996,306), the disclosure of which is incorporated herein by reference.



Use of Biallelic Markers to Identify a Gene Associated with Prostate Cancer
Substantial amounts of LOH data supported the hypothesis that genes associated with distinct cancer types
are located within a particular region of the human genome. More specifically, this region was likely to harbor a gene
associated with prostate cancer. Association studies were performed as described below in order to identify this
prostate cancer gene. A YAC contig containing the genomic region suspected of harboring a gene associated with
prostate cancer was constructed as described in Example 15 below.

Example 15 

YAC Contig Construction in the Candidate Genomic Region
First, a YAC contig which contains the candidate genomic region was constructed as follows. The
CEPH-Genethon YAC map for the entire human genome (Chumakov et al. (1995), supra) was used for detailed contig building
in the genomic region containing genetic markers known to map in the candidate genomic region. Screening data available
for several publicly available genetic markers were used to select a set of CEPH YACs localized within the candidate
region. This set of YACs was tested by PCR with the above mentioned genetic markers as well as with other publicly
available markers supposedly located within the candidate region. As a result of these studies, a YAC STS contig map
was generated around genetic markers known to map in this genomic region. Two CEPH YACs were found to constitute
a minimal tiling path in this region, with an estimated size of ca. 2 Megabases.

During this mapping effort, several publicly known STS markers were precisely located within the contig. 
Example 16 below describes the identification of sets of biallelic markers within the candidate genomic region. 

Example 16 
BAC Contig Construction and Biallelic Markers Isolation within the Candidate Chromosomal Region
Next, a BAC contig covering the candidate genomic region was constructed as follows. BAC libraries were
obtained as described in Woo et al., Nucleic Acids Res. 22:4922-4931 (1994), the disclosure of which is incorporated
herein by reference. Briefly, the two whole human genome BamHI and HindIII libraries already described in Example 1
were constructed using the pBeloBAC11 vector (Kim et al. (1996), supra).

The BAC libraries were then screened with all of the above mentioned STSs, following the procedure described
in Example 2 above.

The ordered BACs, selected by STS screening and verified by FISH, were assembled into contigs and new
markers were generated by partial sequencing of insert ends from some of them. These markers were used to fill the
gaps in the contig of BAC clones covering the candidate chromosomal region having an estimated size of 2 megabases.

Figure 9 illustrates a minimal array of overlapping clones which was chosen for further studies, and the
positions of the publicly known STS markers along said contig.

Selected BAC clones from the contig were subcloned and sequenced, essentially following the procedures 
described in Examples 3 and 4. 

Biallelic markers lying along the contig were identified following the processes described in Examples 5 and 6.
Figure 9 shows the locations of the biallelic markers along the BAC contig. This first set of markers
corresponds to a medium density map of the candidate locus, with an inter-marker distance averaging 50kb-150kb.

A second set of biallelic markers was then generated as described above in order to provide a very high-density
map of the region identified using the first set of markers which can be used to conduct association studies, as
explained below. This very high density map has markers spaced on average every 2-50kb.

The biallelic markers were then used in association studies. DNA samples were obtained from individuals
suffering from prostate cancer and unaffected individuals as described in Example 17.

Example 17 

Collection of DNA Samples from Affected and Non-affected Individuals
Prostate cancer patients were recruited according to clinical inclusion criteria based on pathological or radical
prostatectomy records. Control cases included in this study were both ethnically- and age-matched to the affected
cases; they were checked for both the absence of all clinical and biological criteria defining the presence or the risk of
prostate cancer, and for the absence of related familial prostate cancer cases. Both affected and control individuals
were all unrelated.

The two following groups of independent individuals were used in the association studies. The first group,
comprising individuals suffering from prostate cancer, contained 185 individuals. Of these 185 cases of prostate
cancer, 47 cases were sporadic and 138 cases were familial. The control group contained 104 non-diseased individuals.

Haplotype analysis was conducted using additional diseased (total samples: 281) and control samples (total
samples: 130), from individuals recruited according to similar criteria.

DNA was extracted from peripheral venous blood of all individuals as described in Example 5.

The frequencies of the biallelic markers in each population were determined as described in Example 18. 

Example 18 
Genotyping Affected and Control Individuals

Genotyping was performed using the following microsequencing procedure.

Amplification was performed on each DNA sample using primers designed as previously explained. The pairs of primers
were used to generate amplicons harboring the biallelic markers 99-123, 4-26, 4-14, 4-77, 99-217, 4-67, 99-213,
99-221, 99-135, 99-1482, 4-73, and 4-65 using the protocols described in Example 8 above.

Microsequencing primers were designed for each of the biallelic markers, as previously described.
After purification of the amplification products, the microsequencing reaction mixture was prepared by adding, in a 20 µl
final volume: 10 pmol microsequencing oligonucleotide, 1 U Thermosequenase (Amersham E79000G), 1.25 µl
Thermosequenase buffer (260 mM Tris HCl pH 9.5, 65 mM MgCl2), and the two appropriate fluorescent ddNTPs (Perkin
Elmer, Dye Terminator Set 401095) complementary to the nucleotides at the polymorphic site of each biallelic marker
tested, following the manufacturer's recommendations. After 4 minutes at 84°C, 20 PCR cycles of 15 sec at 55°C, 5
sec at 72°C, and 10 sec at 84°C were carried out in a Tetrad PTC-225 thermocycler (MJ Research). The
unincorporated dye terminators were then removed by ethanol precipitation. Samples were finally resuspended in
formamide-EDTA loading buffer and heated for 2 min at 95°C before being loaded on a polyacrylamide sequencing gel.
The data were collected by an ABI PRISM 377 DNA sequencer and processed using the GENESCAN software (Perkin
Elmer).

Following gel analysis, data were automatically processed with software that allows the determination of the
alleles of biallelic markers present in each amplified fragment.

The software evaluates such factors as whether the intensities of the signals resulting from the above
microsequencing procedures are weak, normal or saturated, or whether the signals are ambiguous. In addition, the
software identifies significant peaks (according to shape and height criteria). Among the significant peaks, peaks
corresponding to the targeted site are identified based on their position. When two significant peaks are detected for
the same position, each sample is categorized as homozygous or heterozygous based on the height ratio.
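
The height-ratio rule can be sketched in Python as follows. This is a simplified illustration, not the software referred to above: the minimum peak height and the heterozygote ratio are hypothetical parameters, and peak shape criteria are ignored.

```python
def call_genotype(peak_heights, min_height=50.0, het_ratio=0.3):
    """Call a genotype from the peak heights observed at the targeted site.

    peak_heights maps each allele (one base) to its peak height.
    Peaks below min_height are treated as not significant.
    """
    significant = {allele: h for allele, h in peak_heights.items() if h >= min_height}
    if not significant:
        return None  # weak signal: no call

    ranked = sorted(significant, key=significant.get, reverse=True)
    if len(ranked) == 1:
        return ranked[0] * 2  # single significant peak: homozygous

    major, minor = ranked[0], ranked[1]
    ratio = significant[minor] / significant[major]
    if ratio >= het_ratio:
        return "".join(sorted((major, minor)))  # comparable peaks: heterozygous
    return major * 2  # minor peak too small relative to the major one: homozygous


print(call_genotype({"C": 900.0, "T": 850.0}))  # CT
print(call_genotype({"C": 900.0, "T": 100.0}))  # CC
```

In practice the ratio threshold would need to be calibrated per marker, since the two dye terminators do not necessarily give equal signal.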

Association analyses were then performed using the biallelic markers as described below.

Example 19 
Association Analysis 

Association studies were run in two successive steps. In a first step, a rough localization of the candidate
gene was achieved by determining the frequencies of the biallelic markers of Figure 9 in the affected and unaffected
populations. The results of this rough localization are shown in Figure 10. This analysis indicated that a gene
responsible for prostate cancer was located near the biallelic marker designated 4-67.

In a second phase of the analysis, the position of the gene responsible for prostate cancer was further refined using the
very high density set of markers including the 99-123, 4-26, 4-14, 4-77, 99-217, 4-67, 99-213, 99-221, 99-135,
99-1482, 4-73, and 4-65 markers.

As shown in Figure 11, the second phase of the analysis confirmed that the gene responsible for prostate
cancer was near the biallelic marker designated 4-67, most probably within a ca. 150kb region comprising the marker.

A haplotype analysis was also performed as described in Example 20. 

Example 20
Haplotype analysis

The allelic frequencies of each of the alleles of biallelic markers 99-123, 4-26, 4-14, 4-77, 99-217, 4-67,
99-213, 99-221, and 99-135 were determined in the affected and unaffected populations. Table 4 lists the internal
identification numbers of the markers used in the haplotype analysis, the alleles of each marker, the most frequent allele
in both unaffected individuals and individuals suffering from prostate cancer, the least frequent allele in both unaffected
individuals and individuals suffering from prostate cancer, and the frequencies of the least frequent alleles in each
population.

Table 4

                                   Frequency of least frequent allele **
Markers     Polymorphic base *     Cases       Controls
99-123      C/T                    0.35        0.3
4-26        A/G                    0.39        0.45
4-14        C/T                    0.35        0.41
4-77        C/G                    0.33        0.24
99-217      C/T                    0.31        0.23
4-67        C/T                    0.26        0.16
99-213      T/C                    0.45        0.36
99-221      C/A                    0.43        0.43
99-135      A/G                    0.25        0.3

*  most frequent allele/least frequent allele
** standard deviations: 0.023 to 0.031 for controls
                        0.018 to 0.021 for cases

Among all the theoretical potential different haplotypes based on 2 to 8 markers, 11 haplotypes showing a 
strong association with prostate cancer were selected. The results of these haplotype analyses are shown in Figure 12. 

Figures 11 and 12 aggregate association analysis results with sequencing results, generated following the 
procedures further described in Example 21, which permitted the physical order and/or the distance between markers to 
be estimated. 

The significance of the values obtained in Figure 12 is underscored by the following results of computer 
simulations. For the computer simulations, the data from the affected individuals and the unaffected controls were 
pooled and randomly allocated to two groups which contained the same number of individuals as the affected and 
unaffected groups used to compile the data summarized in Figure 12. A haplotype analysis was run on these artificial 
groups for the six markers included in haplotype 5 of Figure 12. This experiment was reiterated 100 times and the 
results are shown in Figure 13. Among 100 iterations, only 5% of the obtained haplotypes are present with a p value 
less significant than E-04 as compared to the p value of 9E-07 for haplotype 5 of Figure 12. Furthermore, for haplotype 
5 of Figure 12, only 6% of the obtained haplotypes have a significance level below 5E-03, while none of them show a 
significance level below 5E-03. 

Thus, using the data of Figure 13 and evaluating the associations for single marker alleles or for haplotypes 
will permit estimation of the risk a corresponding carrier has to develop prostate cancer. It will be appreciated that 
significance thresholds of relative risks will be more finely assessed according to the population tested. 

Diagnostic techniques for determining an individual's risk of developing prostate cancer may be implemented as 
described below for the markers in the maps of the present invention, including the 99-123, 4-26, 4-14, 4-77, 99-217, 
4-67, 99-213, 99-221, and 99-135 markers. 

The above haplotype analysis indicated that 171kb of genomic DNA between biallelic markers 4-14 and 99-221 
totally or partially contains a gene responsible for prostate cancer. Therefore, the protein coding sequences lying 
within this region were characterized to locate the gene associated with prostate cancer. This analysis, described in 
further detail below, revealed a single protein coding sequence in the 171kb genomic region, which was designated as 
the PG1 gene. 

Example 21 

Identification of the Genomic Sequence in the Candidate Region 
Template DNA for sequencing the PG1 gene was obtained as follows. BACs E and F from Figure 9 were subcloned 
as previously described. Plasmid inserts were first amplified by PCR on PE 9600 thermocyclers (Perkin-Elmer), using 
appropriate primers, AmpliTaqGold (Perkin-Elmer), dNTPs (Boehringer), buffer and cycling conditions as recommended by the 
Perkin-Elmer Corporation. 

PCR products were then sequenced using automatic ABI Prism 377 sequencers (Perkin Elmer, Applied Biosystems 
Division, Foster City, CA). Sequencing reactions were performed using PE 9600 thermocyclers (Perkin Elmer) with standard 
dye-primer chemistry and ThermoSequenase (Amersham Life Science). The primers were labeled with the JOE, FAM, ROX 
and TAMRA dyes. The dNTPs and ddNTPs used in the sequencing reactions were purchased from Boehringer. Sequencing 
buffer, reagent concentrations and cycling conditions were as recommended by Amersham. 

Following the sequencing reaction, the samples were precipitated with EtOH, resuspended in formamide loading 
buffer, and loaded on a standard 4% acrylamide gel. Electrophoresis was performed for 2.5 hours at 3000V on an ABI 377 
sequencer, and the sequence data were collected and analyzed using the ABI Prism DNA Sequencing Analysis Software, 
version 2.1.2. 

The sequence data obtained as described above were transferred to a proprietary database, where quality control 
and validation steps were performed. A proprietary base-caller flagged suspect peaks, taking into account the shape of the 
peaks, the inter-peak resolution, and the noise level. The proprietary base-caller also performed an automatic trimming. Any 
stretch of 25 or fewer bases having more than 4 suspect peaks was considered unreliable and was discarded. 
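
The discard rule stated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proprietary base-caller: the per-base `suspect` flags are assumed to be available already, the function names are hypothetical, and a stretch is taken to run from the first to the last suspect peak of any group of five suspect peaks spanning at most 25 bases.

```python
def unreliable_mask(suspect, max_len=25, max_suspect=4):
    """Flag every base lying in a stretch of at most max_len bases that
    contains more than max_suspect suspect peaks (illustrative rule)."""
    positions = [i for i, flag in enumerate(suspect) if flag]
    mask = [False] * len(suspect)
    # Compare each suspect peak with the one max_suspect peaks further on:
    # together they delimit a stretch holding max_suspect + 1 suspect peaks.
    for i in range(len(positions) - max_suspect):
        first, last = positions[i], positions[i + max_suspect]
        if last - first + 1 <= max_len:
            for j in range(first, last + 1):
                mask[j] = True
    return mask


def reliable_segments(bases, suspect, max_len=25, max_suspect=4):
    """Split a read into the segments that remain once unreliable stretches are discarded."""
    mask = unreliable_mask(suspect, max_len, max_suspect)
    segments, current = [], []
    for base, bad in zip(bases, mask):
        if bad:
            if current:
                segments.append("".join(current))
                current = []
        else:
            current.append(base)
    if current:
        segments.append("".join(current))
    return segments


# Hypothetical read: five suspect peaks clustered within nine bases.
read = "A" * 10 + "C" * 9 + "G" * 21
flags = [i in (10, 12, 14, 16, 18) for i in range(len(read))]
segments = reliable_segments(read, flags)
```

Whether the original software discarded only the stretch or the whole read is not stated in the text; the sketch keeps the flanking segments.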

The sequence fragments from BAC subclones isolated as described above were assembled using Gap4 
software from R. Staden (Bonfield et al. 1995). This software allows the reconstruction of a single sequence from 
sequence fragments. The sequence deduced from the alignment of different fragments is called the consensus 
sequence. Directed sequencing techniques (primer walking) were used to complete sequences and link contigs. 
Potential functional sequences were then identified as described in Example 22. 

Example 22 
Identification of Functional Sequences 
Potential exons in BAC-derived human genomic sequences were located by homology searches on protein, nucleic 
acid and EST (Expressed Sequence Tags) public databases. Main public databases were locally reconstructed as mentioned 
in Example 14. The protein database, NRPU (Non-redundant Protein Unique), is formed by a non-redundant fusion of the 
Genpept (Benson et al., Nucleic Acids Res. 24:1-5 (1996), the disclosure of which is incorporated herein by reference), 
Swissprot (Bairoch, A. and Apweiler, Nucleic Acids Res. 24:21-25 (1995), the disclosure of which is incorporated herein 
by reference) and PIR/NBRF (George et al., Nucleic Acids Res. 24:17-20 (1995), the disclosure of which is incorporated 
herein by reference) databases. Redundant data were eliminated by using the NRDB software (Benson et al. (1996), supra) 
and internal repeats were masked with the XNU software (Benson et al., supra). Homologies found using the NRPU 
database allowed the identification of sequences corresponding to potential coding exons related to known proteins. 

The EST local database is composed of the gbest section (1-9) of GenBank (Benson et al. (1996), supra), and thus 
contains all publicly available transcript fragments. Homologies found with this database allowed the localization of 
potentially transcribed regions. 

The local nucleic acid database contained all sections of GenBank and EMBL (Rodriguez-Tome et al., Nucleic Acids 
Res. 24:6-12 (1996), the disclosure of which is incorporated herein by reference) except the EST sections. Redundant data 
were eliminated as previously described. 

Similarity searches in protein or nucleic acid databases were performed using the BLAST software (Altschul et al., 
J. Mol. Biol. 215:403-410 (1990), the disclosure of which is incorporated herein by reference). Alignments were refined 
using the Fasta software, and multiple alignments used Clustal W. Homology thresholds were adjusted for each analysis 
based on the length and the complexity of the tested region as well as on the size of the reference database. 

Potential exon sequences identified as above were used as probes to screen cDNA libraries. Extremities of positive 
clones were sequenced and the sequence stretches were positioned on the genomic sequence determined above. Primers 
were then designed using the results from these alignments in order to enable the cloning of cDNAs derived from the gene 
associated with prostate cancer that was identified using the above procedures. 

The obtained cDNA molecules were then sequenced and results of Northern blot analysis of prostate mRNAs 
supported the existence of a major cDNA having a 5-6kb length. The structure of the gene associated with prostate cancer 
was evaluated as described in Example 23. 

Example 23 
Analysis of Gene Structure 

The intron/exon structure of the gene was finally completely deduced by aligning the mRNA sequence from the 
cDNA obtained as described above and the genomic DNA sequence obtained as described above. This alignment 
permitted the positions of the introns and exons, the positions of the start and end nucleotides 
defining each of the at least 8 exons, the locations and phases of the 5' and 3' splice sites, the position of the stop 
codon, and the position of the polyadenylation site to be determined in the genomic sequence. This analysis also yielded 
the positions of the coding region in the mRNA, and the locations of the polyadenylation signal and polyA stretch in the 
mRNA. 

The gene identified as described above comprises at least 8 exons and spans more than 52kb. A G/C rich 
putative promoter region was identified upstream of the coding sequence. A CCAAT in the putative promoter was also 
identified. The promoter region was identified as described in Prestridge, D.S., Predicting Pol II Promoter Sequences 
Using Transcription Factor Binding Sites, J. Mol. Biol. 249:923-932 (1995), the disclosure of which is incorporated 
herein by reference. 

Additional analysis using conventional techniques, such as a 5'RACE reaction using the Marathon-Ready 
human prostate cDNA kit from Clontech (Catalog No. PT1150-1), may be performed to confirm that the 5' end of the cDNA 
obtained above is the authentic 5' end in the mRNA. 

Alternatively, the 5' sequence of the transcript can be determined by conducting a PCR amplification with a 
series of primers extending from the 5' end of the identified coding region. 

The above methods were also used to identify biallelic markers in a gene which was an attractive candidate for 
a gene associated with asthma. Examples 24-31 show how the use of methods of the present invention allowed this 
gene to be identified as a gene responsible, at least partially, for asthma in the studied populations. Additional details of 
the identification of the gene associated with asthma are provided in U.S. Provisional Application Serial Nos. 
60/081,893 (Genset.026PR) and U.S. Provisional Patent Application Genset.026PR2, the disclosures of which are 
incorporated herein by reference. 

Example 24 

Detection of biallelic markers in the candidate gene: DNA extraction 
Donors were unrelated and healthy. They presented a sufficient diversity for being representative of a French 
heterogeneous population. The DNA from 100 individuals was extracted and tested for the detection of the biallelic 
markers. 

30 ml of peripheral venous blood were taken from each donor in the presence of EDTA. Cells (pellet) were 
collected after centrifugation for 10 minutes at 2000 rpm. Red cells were lysed by a lysis solution (50 ml final volume: 
10 mM Tris pH7.6; 5 mM MgCl2; 10 mM NaCl). The solution was centrifuged (10 minutes, 2000 rpm) as many times as 
necessary to eliminate the residual red cells present in the supernatant after resuspension of the pellet in the lysis 
solution. 

The pellet of white cells was lysed overnight at 42°C with 3.7 ml of lysis solution composed of: 
- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M 
- 200 µl SDS 10% 
- 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M) 

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) was added. After vigorous agitation, the 
solution was centrifuged for 20 minutes at 10000 rpm. 

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol were added to the previous supernatant and the solution 
was centrifuged for 30 minutes at 2000 rpm. The DNA solution was rinsed three times with 70% ethanol to eliminate 
salts, and centrifuged for 20 minutes at 2000 rpm. The pellet was dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 
ml water. The DNA concentration was evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA). 

To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio was determined. Only 
DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 were used in the subsequent examples described 
below. 
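
The two spectrophotometric checks above reduce to simple arithmetic. The following Python sketch applies the stated conversion (1 OD unit at 260 nm = 50 µg/ml DNA) and the stated acceptance window for the OD 260 / OD 280 ratio. The function names and the `dilution_factor` parameter are illustrative assumptions that do not appear in the text.

```python
UG_PER_ML_PER_OD260 = 50.0  # conversion factor stated in the text


def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration from the OD at 260 nm.

    dilution_factor is an assumption for samples diluted before reading;
    leave it at 1.0 for an undiluted measurement.
    """
    return od260 * UG_PER_ML_PER_OD260 * dilution_factor


def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """True when the OD 260 / OD 280 ratio lies in the accepted window."""
    ratio = od260 / od280
    return low <= ratio <= high


# Hypothetical readings for one preparation.
concentration = dna_concentration_ug_per_ml(0.5)   # undiluted reading
accepted = passes_purity_check(0.95, 0.5)           # ratio of 1.9
```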

The pool was constituted by mixing equivalent quantities of DNA from each individual. 

Example 25 

Detection of the biallelic markers: amplification of genomic DNA by PCR 
The amplification of specific genomic sequences of the DNA samples of Example 24 was carried out on the 
pool of DNA obtained previously. In addition, 50 individual samples were similarly amplified. 

PCR assays were performed using the following protocol: 

Final volume                                        25 µl 
DNA                                                 2 ng/µl 
MgCl2                                               2 mM 
dNTP (each)                                         200 µM 
primer (each)                                       2.9 ng/µl 
Ampli Taq Gold DNA polymerase                       0.05 unit/µl 
PCR buffer (10x = 0.1 M TrisHCl pH8.3 0.5M KCl)     1x 


Pairs of first primers were designed to amplify the promoter region, exons, and 3' end of the candidate asthma- 
associated gene using the sequence information of the candidate gene and the OSP software (Hillier & Green, 1991). 
These first primers were about 20 nucleotides in length and contained a common oligonucleotide tail upstream of the 
specific bases targeted for amplification which was useful for sequencing. The synthesis of these primers was 
performed following the phosphoramidite method, on a GENSET UFPS 24.1 synthesizer. 

DNA amplification was performed on a Genius II thermocycler. After heating at 94°C for 10 min, 40 cycles 
were performed. Each cycle comprised: 30 sec at 94°C, 55°C for 1 min, and 30 sec at 72°C. For final elongation, 7 min 
at 72°C ended the amplification. The quantities of the amplification products obtained were determined on 96-well 
microtiter plates, using a fluorometer and Picogreen as intercalant agent (Molecular Probes). 

Example 26 

Detection of the biallelic markers: sequencing of amplified genomic DNA and identification of polymorphisms 
The sequencing of the amplified DNA obtained in Example 25 was carried out on ABI 377 sequencers. The 
sequences of the amplification products were determined using automated dideoxy terminator sequencing reactions with 
a dye terminator cycle sequencing protocol. The products of the sequencing reactions were run on sequencing gels and 
the sequences were analyzed as formerly described. 
The sequence data were further evaluated using the above mentioned polymorphism analysis software 
designed to detect the presence of biallelic markers among the pooled amplified fragments. The polymorphism search 
was based on the presence of superimposed peaks in the electrophoresis pattern resulting from different bases occurring 
at the same position as described previously. 

Six fragments of amplification were analyzed. In these segments, 8 biallelic markers were detected. The 
localization of the biallelic markers, the polymorphic bases of each allele, and the frequencies of the most frequent 
alleles are shown in Table 5. 

Table 5 

Amplicon    Marker Name    Origin of DNA    Localization in gene    Polymorphism    Frequency 
1           204/326        Ind.             Promoter                A/G             96.2 (G) 
2           32/357         Pool             Intron 1                A/C             67.7 (C) 
3           33/175         Ind.             Exon 2                  C/T             87.3 (C) 
3           33/234         Pool             Intron 2                A/C             56.7 (C) 
3           33/327         Ind.             Intron 2                C/T             75.3 (T) 
5           35/358         Pool             Intron 4                C/G             67.9 (G) 
5           35/390         Ind.             Intron 4                C/T             82 (C) 
6           36/164         Ind.             Exon 5                  A/G             99.5 (G) 


Allelic frequencies were determined in a population of random blood donors of French Caucasian origin. Their wide 
range is due to the fact that, besides screening a pool of 100 individuals to generate biallelic markers as described 
above, polymorphism searches were also conducted in an individual testing format for 50 samples. This strategy was 
chosen here to provide a potential shortcut towards the identification of putative causal mutations in the association 
studies using them. As the 36/164 biallelic marker was found in only one individual, this marker was not considered in 
the association studies. 

The fourth fragment of amplification carrying exon 3 (not shown in the Table) was not polymorphic in the 
tested samples (1 pool + 50 individuals). 

Example 27 

Validation of the polymorphisms through microsequencing 
The biallelic markers identified in Example 26 were further confirmed and their respective frequencies were 
determined through microsequencing. Microsequencing was carried out for each individual DNA sample described in 
Example 24. 

Amplification from genomic DNA of individuals was performed by PCR as described above for the detection of 
the biallelic markers with the same set of PCR primers described above. 

The preferred primers used in microsequencing were about 19 nucleotides in length and hybridized just upstream 
of the considered polymorphic base. 

Five primers hybridized with the non-coding strand of the gene. For the biallelic markers 204/326, 35/358 and 36/164, 
primers hybridized with the coding strand of the gene. 

The microsequencing reaction was performed as described in Example 18. 

Example 28 

Association study between asthma and the biallelic markers of the candidate gene: collection of DNA samples from 
affected and unaffected individuals 
The asthmatic population used to perform association studies in order to establish whether the candidate gene 
was an asthma-causing gene consisted of 298 individuals. More than 90% of these 298 asthmatic individuals had a 
Caucasian ethnic background. 

The control population consisted of 373 unaffected individuals, among which 279 were French (at least 70% were 
of Caucasian origin) and 94 were American (at least 90% were of Caucasian origin). 

DNA samples were obtained from asthmatic and non-asthmatic individuals as described above. 

Example 29 

Association study between asthma and the biallelic markers of the candidate gene: genotyping of affected and control 
individuals 

The general strategy to perform the association studies was to individually scan the DNA samples from all 
individuals in each of the populations described above in order to establish the allele frequencies of the above described 
biallelic markers in each of these populations. 

Allelic frequencies of the above-described biallelic markers in each population were determined by performing 
microsequencing reactions on amplified fragments obtained by genomic PCR performed on the DNA samples from each 
individual. Genomic PCR and microsequencing were performed as detailed above in Examples 25 and 27 using the 
described amplification and microsequencing primers. 

Example 30 

Association study between asthma and the biallelic markers of the candidate gene 
Table 6 shows the results of the association study between five biallelic markers in the candidate gene and 
asthma. 

Table 6 

            Allelic frequencies (%) 
Markers     Asthmatics           Controls             Frequency diff.    P value 
            (298 individuals)    (373 individuals) 
32/357      A 38.6               A 23.8               8.8                7.34x10^-4 
33/234      A 49                 A 44.3               4.7                8.86x10^-2 
33/327      T 78.5               T 74.6               3.9                1.0x10^-1 
35/358      G 72.3               G 66.9               5.4                3.53x10^-2 
35/390      T 30.4               T 20.3               10.1               2.33x10^-5 


As shown in Table 6, markers 32/357 and 35/390 presented a strong association with asthma, this association being 
highly significant (p value = 7.34x10^-4 for marker 32/357 and 2.33x10^-5 for marker 35/390). 

Three markers showed moderate association when tested independently, namely 33/234, 33/327 and 35/358. 
It is worth mentioning that allelic frequencies for each of the biallelic markers of Table 6 were separately 
measured within the French control population (279 individuals) and the American control population (94 individuals). 
The differences in allele frequencies between the two populations were between 1% and 7%, with p values above 10^-1. 
These data confirmed that the combined French/American control population (373 individuals) was homogeneous enough 
to be used as a control population for the present association study. 
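
The single-marker comparison can be illustrated with a chi-square test on a 2x2 table of allele counts. The sketch below is an assumption about the form of the test, since the text does not state which statistic produced the single-marker p values. It rebuilds allele counts from the published frequencies for marker 35/390 (T allele, 30.4% in 298 asthmatics and 20.3% in 373 controls) and assumes every individual was genotyped, so it should only be expected to land in the same order of magnitude as the tabulated p value.

```python
import math


def allele_chi_square(freq_cases, n_cases, freq_controls, n_controls):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Frequencies are fractions (0.304 for 30.4%). Each individual carries
    two alleles, so counts are rebuilt as frequency * 2 * n.
    """
    a = freq_cases * 2 * n_cases          # tested allele, cases
    b = 2 * n_cases - a                   # other allele, cases
    c = freq_controls * 2 * n_controls    # tested allele, controls
    d = 2 * n_controls - c                # other allele, controls
    total = a + b + c + d
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p = allele_chi_square(0.304, 298, 0.203, 373)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```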

Example 31 

Association studies: Haplotype frequency analysis 

As already shown, one way of increasing the statistical power of individual markers is to perform 
haplotype association analysis. A haplotype analysis for association of markers in the candidate gene and asthma was 
performed by estimating the frequencies of all possible haplotypes for biallelic markers 32/357, 33/234, 33/327, 35/358 
and 35/390 in the asthmatic and control populations described in Example 30 (Table 6), and comparing these frequencies 
by means of a chi-square statistical test (one degree of freedom). Haplotype estimations were performed by applying the 
Expectation-Maximization (EM) algorithm (Excoffier L & Slatkin M, 1995, Mol. Biol. Evol. 12:921-927), using the EM- 
HAPLO program (Hawley ME, Pakstis AJ & Kidd KK, 1994, Am. J. Phys. Anthropol. 18:104). 
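
To make the EM step concrete, the following Python sketch estimates haplotype frequencies for two biallelic markers from unphased genotypes. It is a deliberately reduced illustration, not the EM-HAPLO program: it handles only two loci (the analysis above used up to five), the genotype coding and function name are assumptions, and it runs a fixed number of iterations instead of testing convergence. With two loci the only phase-ambiguous genotype is the double heterozygote, which the E-step splits between the two possible haplotype pairs in proportion to their current expected frequencies.

```python
def em_two_locus(genotypes, iterations=50):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    Each genotype is a pair (g1, g2) giving the number of copies (0, 1 or 2)
    of the allele coded 1 at locus 1 and at locus 2.
    """
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    haplotypes = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haplotypes}  # uninformative starting point
    for _ in range(iterations):
        counts = {h: 0.0 for h in haplotypes}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: weight of the 11/00 pairing versus the 10/01 pairing.
                cis = freq[(1, 1)] * freq[(0, 0)]
                trans = freq[(1, 0)] * freq[(0, 1)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[(1, 1)] += w
                counts[(0, 0)] += w
                counts[(1, 0)] += 1.0 - w
                counts[(0, 1)] += 1.0 - w
            else:
                # Phase is unambiguous when at most one locus is heterozygous.
                x1, x2 = alleles[g1], alleles[g2]
                counts[(x1[0], x2[0])] += 1.0
                counts[(x1[1], x2[1])] += 1.0
        # M-step: new frequencies are expected counts over 2N chromosomes.
        total = 2.0 * len(genotypes)
        freq = {h: counts[h] / total for h in haplotypes}
    return freq


# Hypothetical sample: 2 + 2 homozygotes and 6 double heterozygotes.
sample = [(2, 2)] * 2 + [(0, 0)] * 2 + [(1, 1)] * 6
estimated = em_two_locus(sample)
```

Frequencies estimated separately in cases and in controls can then be compared haplotype by haplotype with a chi-square test, as described above.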

The results of such haplotype analysis are shown in Table 7. 

Table 7 

                                                                             Haplotype frequencies 
Markers            32/357        33/234        33/327       35/358        35/390        Asthm.    Controls    Odds ratio    P value 
Frequency diff.    8.6           4.7           3.9          5.4           10.1 
P value            7.34x10^-4    8.86x10^-2    1.0x10^-1    3.53x10^-2    2.33x10^-5 
Haplotype 1        A                                                      T             0.2       0.11        2.02          8.47x10^-6 
Haplotype 2                      A             T            G                           0.27      0.18        1.68          2.81x10^-4 
Haplotype 3        A             A             T            G             T             0.18      0.09        2.22          3.95x10^-5 


A two-marker haplotype covering markers 32/357 and 35/390 (haplotype 1, AT alleles respectively) presented 
a p value of 8.47x10^-6, an odds ratio of 2.02 and haplotype frequencies of 0.2 for asthmatic and 0.11 for control 
populations respectively. 

A three-marker haplotype covering markers 33/234, 33/327 and 35/358 (haplotype 2, ATG alleles respectively) 
presented a p value of 2.81x10^-4, an odds ratio of 1.68 and haplotype frequencies of 0.27 for asthmatic and 0.18 for 
control populations respectively. 

A five-marker haplotype covering markers 32/357, 33/234, 33/327, 35/358 and 35/390 (haplotype 3, AATGT 
alleles respectively) presented a p value of 3.95x10^-5, an odds ratio of 2.22 and haplotype frequencies of 0.18 for 
asthmatic and 0.09 for control populations respectively. 
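
The odds ratios quoted above agree with the usual ratio of odds computed from the two haplotype frequencies. The short Python check below assumes that definition, which the text does not state explicitly, and uses a hypothetical function name.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the haplotype in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Haplotype frequencies reported for asthmatics and controls.
for name, cases, controls in [
    ("haplotype 1", 0.20, 0.11),
    ("haplotype 2", 0.27, 0.18),
    ("haplotype 3", 0.18, 0.09),
]:
    print(name, round(odds_ratio(cases, controls), 2))
```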

Haplotype association analysis thus increased the statistical power of the individual marker association 
studies when compared to single-marker analysis (from p values between 10^-1 and 2x10^-5 for the individual markers to p 
values between 3x10^-4 and 8x10^-6 for the three-marker haplotype, haplotype 2). 

The significance of the values obtained for the haplotype association analysis was evaluated by the following 
computer simulation test. The genotype data from the asthmatic and control individuals were pooled and randomly 
allocated to two groups which contained the same number of individuals as the trait positive and trait negative groups 
used to produce the data summarized in Table 7. A haplotype analysis was then run on these artificial groups for the 
three haplotypes presented in Table 7. This experiment was reiterated 1000 times and the results are shown in Table 8. 

Table 8 

                                       Permutation Test 
Haplotype                Chi-Square    Average Chi-Square    Maximal Chi-Square    P value 
Haplotype 1 (A-T)        19.70         1.2                   11.6                  1.0x10^-3 
Haplotype 2 (ATG)        13.49         1.2                   10.5                  1.0x10^-3 
Haplotype 3 (AATGT)      16.65         1.2                   9.3                   1.0x10^-3 


The results in Table 8 show that among 1000 iterations only 1‰ of the obtained haplotypes has a p value 
comparable to the one obtained in Table 7. 

These results clearly validate the statistical significance of the haplotypes obtained (haplotypes 1, 2 and 3, 
Table 7). 
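
The permutation procedure described above can be sketched as follows. This Python illustration makes simplifying assumptions that are not in the text: each individual is reduced to a 0/1 flag (carrier of the tested haplotype or not) instead of re-estimating haplotype frequencies by EM in every artificial group, the data are hypothetical, and the function names are invented for the example. It reports the same three summaries as Table 8: the average and maximal chi-square over the permutations and an empirical p value.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (margins must be non-zero)."""
    total = a + b + c + d
    return total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def permutation_test(case_flags, control_flags, iterations=1000, seed=0):
    """Shuffle case/control labels and compare permuted chi-squares with the observed one."""
    def statistic(cases, controls):
        a, c = sum(cases), sum(controls)
        return chi_square_2x2(a, len(cases) - a, c, len(controls) - c)

    n_cases = len(case_flags)
    observed = statistic(case_flags, control_flags)
    pooled = list(case_flags) + list(control_flags)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    permuted = []
    for _ in range(iterations):
        rng.shuffle(pooled)
        # Artificial groups keep the original group sizes.
        permuted.append(statistic(pooled[:n_cases], pooled[n_cases:]))
    exceed = sum(1 for value in permuted if value >= observed)
    return {
        "observed": observed,
        "average": sum(permuted) / iterations,
        "maximum": max(permuted),
        "p_value": exceed / iterations,
    }


# Hypothetical data: 60 of 100 cases and 30 of 100 controls carry the haplotype.
cases = [1] * 60 + [0] * 40
controls = [1] * 30 + [0] * 70
result = permutation_test(cases, controls)
```

When no permuted statistic reaches the observed one, the returned p value is 0 and is best read as "below 1/iterations".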

While Examples 15-31 illustrate the use of the maps and markers of the present invention for identifying a new 
gene associated with a complex disease within a 2Mb genomic region and for establishing that a candidate gene is, at least 
partially, responsible for a disease, the maps and markers of the present invention may also be used to identify one or 
more biallelic markers or one or more genes associated with other detectable phenotypes, including drug response, drug 
toxicity, or drug efficacy. The biallelic markers used in such drug response analyses or shown, using the methods of the 
present invention, to be associated with such traits may lie within or near genes responsible for or partly responsible for 
a particular disease, for example a disease against which the drug is meant to act, or may lie within genomic regions 
which are not responsible for or partly responsible for a disease. For example, the genomic region harboring markers 
associated with a particular drug response may carry a drug metabolism gene, or a gene encoding a protein with a role in 
the drug response mechanism. Thus, biallelic markers within or near genes known to be involved in drug response, 
toxicity, or efficacy or genes suspected of being involved in drug response, toxicity, or efficacy may be used to identify 
individuals likely to respond positively or negatively to drug treatment. In the context of the present invention, a "positive 
response" to a medicament can be defined as comprising a reduction of the symptoms related to the disease or condition 
to be treated. In the context of the present invention, a "negative response" to a medicament can be defined as 
comprising either a lack of positive response to the medicament which does not lead to a symptom reduction or to a 
side-effect observed following administration of the medicament. 

Drug efficacy, response and tolerance/toxicity can be considered as multifactorial traits involving a genetic 
component in the same way as complex diseases such as Alzheimer's disease, prostate cancer, hypertension or diabetes. 
As such, the identification of genes involved in drug efficacy and toxicity could be achieved following a positional cloning 
approach, e.g. performing linkage analysis within families in order to obtain the subchromosomal location of the gene(s). 
However, this type of analysis is actually impractical in the case of drug responsiveness, due to the lack of availability of 
familial cases. In fact, the likelihood of having more than one individual in a particular family being exposed to the same 
drug at the same time is very low. Therefore, drug efficacy and toxicity can only be analyzed as sporadic traits. 

In order to conduct association studies to analyze the individual response to a given drug in groups of patients 
affected with a disease, up to four groups are screened to determine their patterns of biallelic markers using the 
techniques described above. The four groups are: 

- Non-diseased or random controls, 
- Diseased patients/drug responders, 
- Diseased patients/drug non-responders, 
- Diseased patients/drug side effects. 

In preferred embodiments, the above mentioned groups are recruited according to phenotyping criteria having 
the characteristics described above, so that the phenotypes defining the different groups are non-overlapping, preferably 
extreme phenotypes. 

In highly preferred embodiments, such phenotyping criteria have the bimodal distribution described above. 
The final number and composition of the groups for each drug association study is adapted 
to the distribution of the above described phenotypes within the studied population. 

After selecting a suitable population, association and haplotype analyses may be performed as 
described herein to identify one or more biallelic markers associated with drug response, preferably drug toxicity or drug 
efficacy. The identification of such one or more biallelic markers allows one to conduct diagnostic tests to determine 
whether the administration of a drug to an individual will result in drug response, preferably drug toxicity, or drug 
efficacy. 

The methods described above for identifying a gene associated with prostate cancer and biallelic markers 
indicative of a risk of suffering from asthma may be utilized to identify genes associated with other detectable 
phenotypes. In particular, the above methods may be used with any marker or combination of markers included in the 
maps of the present invention, including the 653 biallelic markers obtained above (which include the sequences of SEQ 
ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the PG1 markers, the asthma-associated markers, 
and the Apo E markers of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. As described above, 
the general strategy to perform the association studies using the maps and markers of the present invention is to scan 
two groups of individuals (trait positive individuals and trait negative controls) characterized by a well defined phenotype 
in order to measure the allele frequencies of the biallelic markers in each of these groups. Preferably, the frequencies of 
markers with inter-marker spacing of about 150 kb are determined in each group. More preferably, the frequencies of 
markers with inter-marker spacing of about 75 kb are determined in each group. Even more preferably, markers with 
inter-marker spacing of about 50 kb, about 37.5kb, about 30kb, or about 25kb will be tested in each population. For 
genome-wide studies, it will be preferred to measure the frequencies of about 20,000, or about 40,000 biallelic markers 
in each group. In a highly preferred embodiment, the frequencies of about 60,000, about 80,000, about 100,000, or 
about 120,000 biallelic markers are determined in each group. In some embodiments, haplotype analyses may be run 
using groups of markers located within regions spanning less than 1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb, 
from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, from 500kb to 1Mb, or more than 1Mb. 
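
The marker counts and inter-marker spacings listed above are related by simple division. The Python sketch below assumes a genome size of about 3,000,000,000 bp; that figure is not stated in this passage and is introduced here only because it reproduces the pairing of 150 kb with about 20,000 markers. The function name is hypothetical.

```python
ASSUMED_GENOME_SIZE_BP = 3_000_000_000  # assumption, not stated in the text


def markers_for_spacing(spacing_bp, genome_size_bp=ASSUMED_GENOME_SIZE_BP):
    """Number of evenly spaced markers needed to reach a given average spacing."""
    return genome_size_bp // spacing_bp


# Spacings named in the text, in base pairs.
spacings_bp = (150_000, 75_000, 50_000, 37_500, 30_000, 25_000)
marker_counts = {spacing: markers_for_spacing(spacing) for spacing in spacings_bp}
for spacing, count in marker_counts.items():
    print(f"{spacing / 1000:g} kb spacing -> about {count:,} markers")
```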

Allele frequency can be measured using microsequencing techniques described herein; preferred high 
throughput microsequencing procedures are further exemplified below; it will be further appreciated that any other large 
scale genotyping method suitable for the intended purpose contemplated herein may also be used. 

In some embodiments of the present invention a computer-based system may support the on-line coordination 
between the identification of biallelic markers and the corresponding analysis of their frequency in the different groups. 

It will be appreciated that it is not necessary to use a full high density biallelic marker map in order to start a 
genome-wide association study. It is sufficient to generate and use a first set of about 20,000 markers (one marker per 
BAC, average inter-marker spacing of about 150kb). Maps having higher densities of biallelic markers (two or more 
markers per BAC, average inter-marker spacing of about 75kb or less) may then be generated by starting first on those 
BACs for which a candidate association has been established at the first step. 

In cases when one or more candidate regions have previously been delineated, such as cases where a particular 
gene or genomic region is suspected of being associated with a trait, local excerpts of biallelic marker maps having 
densities above one marker per 150kb may be exploited using BACs harboring said genomic regions, or genes, or portions 
thereof. In these cases also, successive association studies may be performed using sets of biallelic markers showing 
increasing densities, preferably from about one every 150 kb to about one every 75kb; more preferably, sets of markers 
with inter-marker spacing below about 50kb, below about 37.5kb, below about 30kb, most preferably below about 25 
kb, will be used. 

Haplotype analyses may also be conducted using groups of biallelic markers within the candidate region. The 
biallelic markers included in each of these groups may be located within a genomic region spanning less than 1kb, from 1 
to 5kb, from 5 to 10kb, from 10 to 25kb, from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, 
from 500kb to 1Mb, or more than 1Mb. It will be appreciated that the ordered DNA fragments containing these groups of 
biallelic markers need not completely cover the genomic regions of those lengths but may instead be incomplete contigs 
having one or more gaps therein. As discussed in further detail below, biallelic markers may be used in association studies 
and haplotype analyses regardless of the completeness of the corresponding physical contig harboring them, provided linkage 
disequilibrium between the markers can be assessed. 
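
Assessing linkage disequilibrium between two biallelic markers requires only haplotype and allele frequencies, not a complete contig. The sketch below uses the common textbook measures D, |D'| and r squared; these are standard definitions chosen for illustration and are not necessarily the measures used elsewhere in this description. The function name is hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, abs_D_prime, r_squared) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at the first marker
          and allele B at the second marker
    p_a, p_b: frequencies of allele A and of allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both markers must be polymorphic")
    d = p_ab - p_a * p_b
    # The largest |D| attainable depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    abs_d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, abs_d_prime, r_squared


# Hypothetical frequencies: partial and complete disequilibrium.
print(linkage_disequilibrium(0.4, 0.5, 0.5))
print(linkage_disequilibrium(0.3, 0.3, 0.3))
```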

As described above, if a positive association with a trait, such as a disease, or a drug efficacy and/or toxicity, 
is identified using the biallelic markers and maps of the present invention, the maps will provide not only the 
confirmation of the association, but also a shortcut towards the identification of the gene involved in the trait under 
study. As described above, since the markers showing positive association to the trait are in linkage disequilibrium with 
the trait loci, the causal gene will be physically located in the vicinity of these markers. Regions identified through 
association studies using high density maps will on average have a 20 to 40 times shorter length than those identified by 
linkage analysis (2 to 20 Mb). 

As described above, once a positive association is confirmed with the high density biallelic marker maps of the 
present invention, BACs from which the most highly associated markers were derived are completely sequenced and the 
mutations in the causal gene are searched by applying genomic analysis tools. As described above, once a region 
harboring a gene associated with a detectable trait has been sequenced and analyzed, the candidate functional regions 
(e.g. exons and splice sites, promoters and other regulatory regions) are scanned for mutations by comparing the 
sequences of a selected number of controls and cases, using adequate software. 

In some embodiments, trait positive samples being compared to identify causal mutations are selected among 
those carrying the ancestral haplotype; in these embodiments, control samples are chosen from individuals not carrying 
said ancestral haplotype. 

In further embodiments, trait positive samples being compared to identify causal mutations are selected among 
those showing haplotypes that are as close as possible to the ancestral haplotype; in these embodiments, control 
samples are chosen from individuals not carrying any of the haplotypes selected for the case population. 
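
The two selection schemes above can be sketched as follows. This is a simplified illustration under stated assumptions: each sample is represented by a single haplotype string with one character per marker, closeness to the ancestral haplotype is measured by the number of mismatching markers, and controls are additionally required to be trait negative. All names and data are hypothetical.

```python
def hamming_distance(hap_a, hap_b):
    """Number of marker positions at which two haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(x != y for x, y in zip(hap_a, hap_b))


def select_samples(haplotypes, traits, ancestral, max_mismatches=0):
    """Split sample identifiers into (cases, controls).

    haplotypes: dict of sample id -> haplotype string
    traits: dict of sample id -> True (trait positive) or False (trait negative)
    Cases are trait positive samples within max_mismatches of the ancestral
    haplotype. Controls are trait negative samples (an assumption of this
    sketch) that carry none of the haplotypes selected for the cases.
    """
    cases = sorted(s for s, h in haplotypes.items()
                   if traits[s] and hamming_distance(h, ancestral) <= max_mismatches)
    case_haplotypes = {haplotypes[s] for s in cases}
    controls = sorted(s for s, h in haplotypes.items()
                      if not traits[s] and h not in case_haplotypes)
    return cases, controls


haplotypes = {"s1": "ACGT", "s2": "ACGA", "s3": "ACGA", "s4": "ACGT", "s5": "TTGA"}
traits = {"s1": True, "s2": True, "s3": False, "s4": False, "s5": False}
print(select_samples(haplotypes, traits, "ACGT"))                    # ancestral haplotype only
print(select_samples(haplotypes, traits, "ACGT", max_mismatches=1))  # close haplotypes allowed
```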

The mutation detection procedure is essentially similar to that used for biallelic site identification. A pair of 
oligonucleotide primers is designed in order to amplify the sequences to be tested. In preferred embodiments, priority is 
given to the testing of functional sequences; in such embodiments, sequences covering every exon/promoter predicted 
region, preferably including potential splice sites, are determined and compared between the T+ and T- populations. 
Amplification is carried out on DNA samples from T+ and T- individuals using the polymerase chain reaction under the 
above described conditions. To be sequenced, amplification products from genomic PCR may be subjected to automated 
dideoxy terminator sequencing reactions and electrophoresed on ABI 377 sequencers. Following gel image analysis and 
DNA sequence extraction, ABI sequence data are automatically analyzed to detect the presence of sequence variations 
among T+ and T- individuals. Sequences are preferably verified by comparing the sequences of both DNA strands of 
each individual. 
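
The comparison and strand verification steps can be pictured with a minimal sketch. It assumes that reads are already aligned, are of equal length, and carry a single base call per position (heterozygous calls are ignored); it does not represent the analysis software referred to above, and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def variant_positions(seq_a, seq_b):
    """0-based positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def strands_agree(forward_read, reverse_read):
    """True when the reverse-strand read confirms the forward-strand read."""
    return forward_read == reverse_complement(reverse_read)


t_plus = "ATGGCCTTAGC"   # hypothetical read from a T+ individual
t_minus = "ATGGCATTAGC"  # hypothetical read from a T- individual
print(variant_positions(t_plus, t_minus))
print(strands_agree(t_plus, reverse_complement(t_plus)))
```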

It is preferred that candidate polymorphisms be then verified by screening a larger population of cases and 
controls by means of any genotyping procedure such as those described herein, preferably using a microsequencing 
technique in an individual test format. Polymorphisms are considered as candidate mutations when present in cases and 
controls at frequencies compatible with the expected association results. 
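
One conventional way to judge whether allele frequencies differ between cases and controls is a Pearson chi-square test on the 2x2 table of allele counts. The sketch below is a standard statistical illustration, not necessarily the test used for the association results referred to above; the function name and counts are hypothetical.

```python
import math


def allele_association(case_a, case_b, control_a, control_b):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele count table.

    Returns (chi_square, p_value). Counts are numbers of alleles, not individuals.
    """
    table = [[case_a, case_b], [control_a, control_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + control_a, case_b + control_b]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain observations")
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi_square / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical allele counts: 110 A / 90 B in cases, 75 A / 125 B in controls.
chi_square, p_value = allele_association(110, 90, 75, 125)
print(round(chi_square, 3), p_value)
```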

The maps and biallelic markers of the present invention may also be used to identify patterns of biallelic 
markers associated with detectable traits resulting from polygenic interactions. The analysis of genetic interaction 
between alleles at unlinked loci requires individual genotyping using the techniques described herein. The analysis of 
allelic interaction among a selected set of biallelic markers with appropriate p-values can be considered as a haplotype 
analysis, similar to those described in further detail within the present invention. 

Use of Biallelic Markers to Identify Individuals Likely to Exhibit a Detectable 
Trait Associated with a Particular Allele of a Known Gene 
In addition to their utility in searches for genes associated with detectable traits on a genome-wide, chromosome-wide, 
or subchromosomal level, the maps and biallelic markers of the present invention may be used in more targeted 
approaches for identifying individuals likely to exhibit a particular detectable trait or individuals who exhibit a particular 
detectable trait as a consequence of possessing a particular allele of a gene associated with the detectable trait. For 
example, the biallelic markers and maps of the present invention may be used to identify individuals who carry an allele of a 
known gene that is suspected of being associated with a particular detectable trait. In particular, the target genes may be 
genes having alleles which predispose an individual to suffer from a specific disease state. In other cases, the target genes 
may be genes having alleles that predispose an individual to exhibit a desired or undesired response to a drug or other 
pharmaceutical composition, a food, or any administered compound. The known gene may encode any of a variety of types 
of biomolecules. For example, the known genes targeted in such analyses may be genes known to be involved in a particular 
step in a metabolic pathway in which disruptions may cause a detectable trait. Alternatively, the target genes may be genes 
encoding receptors or ligands which bind to receptors in which disruptions may cause a detectable trait, genes encoding 
transporters, genes encoding proteins with signaling activities, genes encoding proteins involved in the immune response, 
genes encoding proteins involved in hematopoiesis, or genes encoding proteins involved in wound healing. It will be 
appreciated that the target genes are not limited to those specifically enumerated above, but may be any gene known to 
be or suspected of being associated with a detectable trait. 

As previously mentioned, the maps and markers of the present invention may be used to identify genes 
associated with drug response. Accordingly, the present invention comprises a method of using a drug comprising 
obtaining a nucleic acid sample from an individual, determining the identity of the polymorphic base of one or more 
biallelic markers obtained by the methods described above which is or are associated with a positive response to 
treatment with the drug or one or more biallelic markers obtained by the methods described above which is or are 
associated with a negative response to treatment with the drug, and administering the drug to the individual if the 
nucleic acid sample contains one or more alleles of biallelic markers associated with a positive response to treatment 
with the drug or if said nucleic acid sample lacks one or more alleles of biallelic markers associated with a negative 
response to the drug. In some embodiments of the method, the administering step comprises administering the drug to 
the individual if the nucleic acid sample contains one or more alleles of biallelic markers associated with a positive 
response to treatment with the drug and the nucleic acid sample lacks one or more alleles of biallelic markers associated 
with a negative response to the drug. 
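
The difference between the two administering conditions (positive allele present or negative allele absent, versus both conditions together) can be made explicit in a short sketch. It interprets "lacks one or more alleles" as carrying none of the listed negative-response alleles, which is an assumption of this illustration; the function name, marker names and alleles are hypothetical.

```python
def should_administer(sample_alleles, positive_alleles, negative_alleles,
                      require_both=False):
    """Decide whether the genotype supports administering the drug.

    sample_alleles: set of (marker, base) pairs found in the nucleic acid sample
    positive_alleles: alleles associated with a positive response
    negative_alleles: alleles associated with a negative response
    require_both=False: a positive allele is present OR no negative allele is present
    require_both=True:  a positive allele is present AND no negative allele is present
    """
    has_positive = bool(sample_alleles & positive_alleles)
    lacks_negative = not (sample_alleles & negative_alleles)
    if require_both:
        return has_positive and lacks_negative
    return has_positive or lacks_negative


sample = {("marker_1", "A"), ("marker_2", "T")}
positive = {("marker_1", "A")}
negative = {("marker_2", "T")}
print(should_administer(sample, positive, negative))                     # broader condition
print(should_administer(sample, positive, negative, require_both=True))  # stricter condition
```

The same rule applies unchanged when the decision is whether to include an individual in a clinical trial rather than whether to administer the drug.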

The biallelic markers of the present invention may also be used to select individuals for inclusion in 
the clinical trials of a drug. By selecting individuals who are likely to respond favorably to a drug for inclusion in the 
trial, the effectiveness of the drug can be assessed without lowering the measured effectiveness as a result of including 
nonresponders or negative responders in the clinical trial. Perhaps more importantly, using such selection may avoid 
including patients who may suffer from undesirable side effects if administered the drug under trial, thus increasing the 
safety of clinical trials. Accordingly, the present invention also includes a method of selecting an individual for inclusion 
in a clinical trial of a drug comprising obtaining a nucleic acid sample from an individual, determining the identity of the 
polymorphic base of one or more biallelic markers obtained by the methods described above which is or are associated 
with a positive response to treatment with the drug or one or more biallelic markers associated with a negative response 
to treatment with the drug in the nucleic acid sample, and including the individual in the clinical trial if the nucleic acid 
sample contains one or more alleles of biallelic markers obtained by the methods described above which is or are 
associated with a positive response to treatment with said drug or if the nucleic acid sample lacks one or more alleles of 
biallelic markers associated with a negative response to the drug. In one embodiment of the method, the inclusion step 
comprises including the individual in the clinical trial if the nucleic acid sample contains one or more alleles of biallelic 
markers associated with a positive response to treatment with the drug and the nucleic acid sample lacks one or more 
alleles of biallelic markers associated with a negative response to the drug. 

In particular embodiments, one or several of the ApoE linked markers of SEQ ID Nos 301-305/307-311 or the 
sequences complementary thereto may be used in targeted approaches to identify individuals who are likely to develop 
Alzheimer's disease, or to identify individuals who do suffer from Alzheimer's disease. In other embodiments, one or more of 
the markers of SEQ ID Nos. 306 and 312 and one or more of the ApoE linked markers of SEQ ID Nos 301-305/307-311 
or the sequences complementary thereto are genotyped in targeted approaches to identify individuals who are likely to develop 
Alzheimer's disease, or to identify individuals who do suffer from Alzheimer's disease. In further embodiments, one or several 
of the PG1 linked markers may be tested in targeted approaches to identify individuals who are likely to develop prostate 
cancer, or to identify individuals who do suffer from prostate cancer. Finally, individuals likely to be asthmatic, or asthmatic 
individuals, can be identified using one or more of the asthma-associated markers to conduct the procedures of the present 
invention. 

Given the high number of cancer types in which the PG1 chromosomal region is involved, it will be appreciated that 
the PG1 markers may be employed to identify individuals at risk of developing cancers other than prostate cancer, or to 
identify individuals suffering from cancers other than prostate cancer. It will be further appreciated that the asthma-associated 
markers may be tested to identify individuals likely to exhibit, or exhibiting, inflammatory traits other than the 
asthmatic state (e.g. arthritis, or psoriasis, among others). The present invention provides adequate methods to establish 
associations between markers, such as those mentioned above, and candidate traits expressly contemplated herein, thus 
legitimating the corresponding targeted approaches to identify individuals likely to exhibit or exhibiting said candidate traits. 

In some embodiments, the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 
1-50 and 51-100 or the sequences complementary thereto) may be used in targeted approaches to identify individuals at 
risk of developing a detectable trait, for example a complex disease or desired/undesired drug response, or to identify 
individuals exhibiting said trait. The present invention provides methods to establish putative associations between any of 
the biallelic markers described herein and any detectable traits, including those specifically described herein. 

To use the maps and markers of the present invention in further targeted approaches, biallelic markers which are 
in linkage disequilibrium with any of the above disclosed markers may be identified. In cases where one or more biallelic 
markers of the present invention have been shown to be associated with a detectable trait, more biallelic markers in linkage 
disequilibrium with said associated biallelic markers may be generated and used to perform targeted approaches aiming at 
identifying individuals exhibiting, or likely to exhibit, said detectable trait, according to the methods provided herein. 

Furthermore, in cases where a candidate gene is suspected of being associated with a particular detectable trait or 
suspected of causing the detectable trait, biallelic markers in linkage disequilibrium with said candidate gene may be 
identified and used in targeted approaches, such as the approaches utilized above for the asthma-associated gene and the 
ApoE gene. 

Biallelic markers that are in linkage disequilibrium with markers associated with a detectable trait, or with genes 
associated with a detectable trait or suspected of being so, are identified by performing single marker analyses, haplotype 
association analyses, or linkage disequilibrium measurements on samples from trait positive and trait negative individuals as 
described above using biallelic markers lying in the vicinity of the target marker or gene. In this manner, a single biallelic 
marker or a group of biallelic markers may be identified which indicate that an individual is likely to possess the detectable 
trait or does possess the detectable trait as a consequence of a particular allele of the target marker or gene. 

Nucleic acid samples from individuals to be tested for predisposition to a detectable trait or possession of a 
detectable trait as a consequence of a particular allele of the target gene may be examined using the diagnostic methods 
described below. 

Diagnostic Methods 

To use the maps and biallelic markers of the present invention to diagnose whether an individual is predisposed to 
express a detectable trait or whether the individual expresses a detectable trait as a result of a particular mutation, one or 
more biallelic markers indicative of such a predisposition or causative mutation are identified by performing association 
studies and haplotype analysis on affected and non-affected individuals as described above. 

The diagnostic techniques of the present invention may employ a variety of methodologies to determine 
whether a test subject has a biallelic marker pattern associated with an increased risk of developing a detectable trait or 
whether the individual suffers from a detectable trait as a result of a particular mutation, including methods which 
enable the analysis of individual chromosomes for haplotyping, such as family studies, single sperm DNA analysis or 
somatic hybrids. 

The trait analyzed using the present diagnostics may be any detectable trait, including diseases, drug response, 
drug efficacy, or drug toxicity. A "positive" drug response may refer to a response indicating either some drug efficacy 
or no drug toxicity. Diagnostics which analyze drug response, drug efficacy, or drug toxicity may be used to determine 
whether an individual should be treated with a particular drug. For example, if the diagnostic indicates a likelihood that 
an individual will respond positively to treatment with a particular drug, the drug may be administered to the individual. 
Conversely, if the diagnostic indicates that an individual is likely to respond negatively to treatment with a particular 
drug, an alternative course of treatment may be prescribed. A negative response may be defined as either the absence 
of an efficacious response or the presence of toxic side effects. 

Clinical drug trials represent another application for the maps and markers of the present invention. One or 
more markers indicative of drug response, drug efficacy, or drug toxicity may be identified using the techniques 
described above. Thereafter, potential participants in clinical trials of the drug may be screened to identify those 
individuals most likely to respond favorably to the drug and exclude those likely to experience side effects. In that way, 
the effectiveness of drug treatment may be measured in individuals who respond positively to the drug, without lowering 
the measurement as a result of the inclusion of individuals who are unlikely to respond positively in the study and 
without risking undesirable safety problems. 

In each of the diagnostic methods, a nucleic acid sample is obtained from the test subject and the biallelic 
marker pattern is determined for one or more of the biallelic markers included in the maps of the present invention, 
including the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the 
sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic 
markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In other 
embodiments, the biallelic marker pattern of one or more of the markers of SEQ ID Nos. 306 and 312 is determined in 
addition to determining the biallelic marker pattern of one or more of the biallelic markers included in the maps of the 
present invention, including the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 
and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic 
markers, and the Apo E biallelic markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences 
complementary thereto. In some embodiments, the biallelic marker pattern is determined by conducting an amplification 
reaction to generate amplicons containing the polymorphic bases of the one or more biallelic markers to be genotyped. 
The identities of the polymorphic bases of the one or more biallelic markers to be analyzed may be determined using a 
variety of methods, including hybridization assays which specifically detect amplification products containing particular 
alleles of the one or more biallelic markers, and microsequencing reactions which identify the polymorphic bases of the 
one or more biallelic markers to be analyzed. 

While the following discussion utilizes the 653 biallelic markers obtained above (which include the sequences 
of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the 
PG1 biallelic markers, and the Apo E biallelic markers as examples of the diagnostics of the present invention, it will be 
appreciated that the same diagnostics may be used in conjunction with any marker or any group of markers included in 
the maps of the present invention. 

Examples of amplification primers enabling the amplification, from subjects' genomic DNA samples, of DNA 
fragments that carry each of the markers of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto, are 
oligonucleotides of SEQ ID Nos: 101-150 and 151-200; pairs of corresponding primers for a given biallelic marker may 
be reconstituted by choosing the adequate upstream oligonucleotide from SEQ ID Nos. 101-150 together with the 
corresponding downstream oligonucleotide from SEQ ID Nos: 151-200. 
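
Reconstituting a primer pair amounts to a lookup across the blocks of the sequence listing. The sketch below assumes that entry n of each block refers to the same marker, so that fixed offsets link a marker to its primers; that ordering is an assumption of this illustration and should be checked against the sequence listing itself. The function name is hypothetical.

```python
def primer_pair_for_marker(marker_index):
    """Return the SEQ ID numbers tied to one of the 50 listed markers.

    marker_index runs from 1 to 50. The fixed offsets assume that the blocks
    of the sequence listing share the same marker order (an assumption).
    """
    if not 1 <= marker_index <= 50:
        raise ValueError("marker_index must be between 1 and 50")
    return {
        "first_allele": marker_index,             # SEQ ID Nos. 1-50
        "second_allele": marker_index + 50,       # SEQ ID Nos. 51-100
        "upstream_primer": marker_index + 100,    # SEQ ID Nos. 101-150
        "downstream_primer": marker_index + 150,  # SEQ ID Nos. 151-200
    }


print(primer_pair_for_marker(7))
```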

SEQ ID Nos: 1-50 correspond to the sequence identification numbers for a first allele of the biallelic markers of 
SEQ ID Nos: 1-50 and 51-100 and SEQ ID Nos: 51-100 correspond to the sequence identification numbers for a second 
allele of the biallelic markers of SEQ ID Nos: 1-50 and 51-100. 

SEQ ID Nos: 313-318 correspond to sequence identification numbers of upstream amplification primers 
that may be used to generate amplification products containing the polymorphic bases of the biallelic markers of 
respective SEQ ID Nos: 301-306/307-312. SEQ ID Nos: 319-324 correspond to downstream amplification primers that 
may be used to generate amplification products containing the polymorphic bases of the biallelic markers of respective 
SEQ ID Nos: 301-306/307-312. 

For all markers of SEQ ID Nos: 1-50/51-100 and 301-306/307-312 or the sequences complementary thereto, 
the enclosed listings indicate the position and identity of the polymorphic base in each biallelic marker. Potential 
microsequencing primers are also included in the sequence listing. The sequences of SEQ ID Nos. 201-250 may be used 
in microsequencing procedures such as those described herein to determine the sequence of the polymorphic bases of the 
biallelic markers of SEQ ID Nos. 1-50/51-100. The sequences of SEQ ID Nos. 325-330 or 331-336 may be used in 
microsequencing procedures such as those described herein to determine the sequence of the polymorphic bases of the 
biallelic markers of SEQ ID Nos. 301-306/307-312. 

All listings indicate the internal identification number corresponding to the biallelic marker to which the listed sequence 
is related. 

One aspect of the present invention is a method for determining whether an individual is at risk of developing 
Alzheimer's Disease or whether an individual suffers from Alzheimer's Disease as a consequence of possessing the Apo E 
ε4 site A allele. The method involves obtaining a nucleic acid sample from the individual and determining whether the 
nucleic acid sample contains one or more markers indicative of a risk of developing Alzheimer's Disease or one or more 
markers indicative that the individual suffers from Alzheimer's Disease as a result of possessing the Apo E ε4 site A 
allele. In one embodiment, the method comprises determining the identity of the polymorphic base of one or more 
biallelic markers selected from the group consisting of SEQ ID Nos. 301-305/307-312 or the sequences complementary 
thereto in the nucleic acid sample. In a further embodiment, the method involves determining whether the nucleic acid 
sample contains the sequence of SEQ ID No. 306 (the C allele of marker 99-2452/54 containing the Apo E ε4 site A 
allele) or the sequence complementary thereto. In a further embodiment the method comprises determining whether the 
nucleic acid sample contains SEQ ID No. 311 (the T allele of marker 99-365/344) or the sequence complementary 
thereto. In another embodiment, the method comprises determining whether the nucleic acid sample contains SEQ ID 
No. 311 (the T allele of marker 99-365/344) and SEQ ID No. 306 (the C allele of marker 99-2452/54 containing the Apo 
E site A allele) or the sequence complementary thereto. 

In still a further embodiment, the method comprises determining whether the nucleic acid sample contains SEQ 
ID No. 302, 301, 303, and 304 or the sequences complementary thereto. In still a further embodiment, the method 
comprises determining whether the nucleic acid sample contains SEQ ID Nos. 302, 303, and 304 or the sequences 
complementary thereto. In a further embodiment the method comprises determining whether the nucleic acid sample 
contains SEQ ID No. 311 (the T allele of marker 99-365/344) or the sequence complementary thereto. 

In some embodiments, the step of determining the identity of the polymorphic base of one or more biallelic 
markers selected from the group consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences 
complementary thereto in the nucleic acid sample comprises conducting an amplification reaction on said nucleic acid 
sample using one or more of the amplification primers selected from the group consisting of SEQ ID Nos. 313-317 and 
SEQ ID Nos. 319-323 and determining the identity of the polymorphic base in said one or more biallelic markers. 

In some embodiments, the identity of the polymorphic base may be determined using one or more of the 
microsequencing primers listed as SEQ ID Nos. 325-329 or 331-335. In embodiments comprising the step of 
determining whether the nucleic acid sample contains the sequence of SEQ ID No. 306, the method may comprise 
conducting an amplification reaction on the nucleic acid sample using the pair of amplification primers consisting of SEQ 
ID Nos. 318 and 324. In some embodiments, the step of determining whether the nucleic acid sample contains the 
sequence of SEQ ID No. 306 comprises conducting a microsequencing reaction using one of the microsequencing primers 
listed as SEQ ID Nos. 330 or 336. 

Another aspect of the present invention relates to a method of determining whether an individual is at risk of 
developing a trait or whether an individual expresses a trait as a consequence of possessing a particular trait-causing 
allele. Alternatively, another aspect of the present invention relates to a method of determining whether an individual is 
at risk of developing a plurality of traits or whether an individual expresses a plurality of traits as a result of possessing 
particular trait-causing alleles. These methods involve obtaining a nucleic acid sample from the individual and 
determining whether the nucleic acid sample contains one or more markers indicative of a risk of developing the trait or 
one or more markers indicative that the individual expresses the trait as a result of possessing a particular trait-causing 
allele. In one embodiment, the methods comprise determining the identity of the polymorphic base of one or more 
biallelic markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which 
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated 
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers. In a further embodiment, the methods 
comprise determining the identities of the polymorphic bases of at least two, at least three, at least five, at least eight, 
at least 20, at least 100, at least 200, at least 300, at least 400, between 400 and 2,000, between 2,000 and 4,000, 
between 4,000 and 10,000, between 10,000 and 20,000 or more than 20,000 of the biallelic markers in the maps of 
the present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID 
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 
biallelic markers, and the new Apo E biallelic markers. 

In some embodiments, the step of determining the identity of the polymorphic base of one or more biallelic 
markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which include 
the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated 
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers, comprises conducting an amplification 
reaction on said nucleic acid sample using appropriate amplification primers and determining the identity of the 
polymorphic base in said one or more biallelic markers. In some embodiments, the identity of the polymorphic base may 
be determined using appropriate microsequencing primers. 

As described herein, the diagnostics may be based on a single biallelic marker or a group of biallelic markers. 
Without wishing to be limited to any particular value, it is preferred that the biallelic marker used in single marker 
diagnostics, either as a positive basis for further diagnostic tests or as a preliminary starting point for early preventive 
therapy, exhibit a p value in preliminary screening association analyses of about 1 x 10^-2 or less. More preferably the p 
value is about 1 x 10^-4 or less. 

Similarly, without wishing to be limited to any particular value for diagnostics based on more than one biallelic 
marker, it is preferred that the haplotype exhibit a p value of 1 x 10^-3 or less, still more preferably 1 x 10^-6 or less and 
most preferably of about 1 x 10^-8 or less in a preliminary screening haplotype analysis. These values are believed to be 
applicable to any association studies involving single or multiple marker combinations. Significance thresholds may be 
refined according to the methods previously described. 
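
The successive preference levels can be applied mechanically to a p value from a preliminary screen. In the sketch below the exponents are taken as 10^-2 and 10^-4 for single marker diagnostics and 10^-3, 10^-6 and 10^-8 for haplotype diagnostics, following the paragraphs above; they are parameters of this illustration and should be adjusted if other thresholds are intended. The names are hypothetical.

```python
SINGLE_MARKER_THRESHOLDS = (1e-2, 1e-4)    # preferred, more preferred
HAPLOTYPE_THRESHOLDS = (1e-3, 1e-6, 1e-8)  # preferred, more preferred, most preferred


def preference_tier(p_value, thresholds):
    """Return how many of the successively stricter thresholds p_value meets.

    0 means the association does not reach the first preferred threshold.
    """
    return sum(p_value <= t for t in thresholds)


print(preference_tier(5e-5, SINGLE_MARKER_THRESHOLDS))
print(preference_tier(5e-5, HAPLOTYPE_THRESHOLDS))
print(preference_tier(0.05, SINGLE_MARKER_THRESHOLDS))
```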

Example 32 describes methods for determining the biallelic marker pattern in a nucleic acid sample. 

Example 32 

A nucleic acid sample is obtained from an individual to be tested for susceptibility to a detectable trait or for a 
detectable trait caused by a particular mutation. The nucleic acid sample may be an RNA sample or a DNA sample. 

A PCR amplification is conducted using primer pairs which B enerate amplification products containing the 
polymorphic nucleotides of one nr more biallefic markers associated with such a predisposition or causative mutation. 
For example, the ampliation products may contain the polymorphic bases of one or more of the biallelic ma,kcrs in the 
maps of the prosent invention. including any of the 653 biallelic markers obtained above (which include the sequences of 
SEQ ID lbs. 1-50 and 51-100 or the sequences complementary thereto), the asthma-assuciated biattefic makers, the 
PG1 biallelic markers, and the Apo E biallelic markers or biallelic markers in linkage disequilibrium with any of these 
biallefic markers. In some embodiments, the PCR amplication is conducted usinrj primor pairs which D cncrate 
amplification products containino the polymorphic nucleotides of several biallefic markers. For example, in one 
embodiment, amplification products containino the polymorphic bases of one or more biallefic markers in the maps of the 
present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ 10 
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 
biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a 
causative mutation associated with a detectable phenotype may be generated. In another embodiment, amplification 
products containing the polymorphic bases of five or more biallelic markers in the maps of the present invention, 
including any of the the 653 biallelic markers obtained above (which include the sequences of SEQ 10 Nos. 1-50 and 51- 
100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and 
the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a causative mutation 
associated with a detectable phenotype may be generated. In another embodiment, amplification products containing the 
polymorphic bases of 20 or more biallelic markers in the maps of the present invention, including any of the 653 biallelic 
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary 
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic markers, biallelic
markers which are in linkage disequilibrium therewith or with the causative mutation may be generated. In another
embodiment, amplification products containing the polymorphic bases of 100 or more biallelic markers in the maps of the
present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a
causative mutation associated with a detectable phenotype may be generated. In another embodiment, amplification
products containing the polymorphic bases of 200 or more biallelic markers in the maps of the present invention,
including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a causative mutation
associated with a detectable phenotype may be generated. In another embodiment, amplification products containing the
polymorphic bases of 300 or more biallelic markers in the maps of the present invention, including any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences
complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic
markers, biallelic markers which are in linkage disequilibrium therewith or with the causative mutation may be
generated. In another embodiment, amplification products containing the polymorphic bases of 400 or more biallelic
markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage
disequilibrium therewith or with a causative mutation associated with a detectable phenotype may be generated.

The primers used to generate the amplification products may be designed as described herein. Representative 
amplification primers for generating amplification products containing the polymorphic bases of the biallelic markers of 
SEQ ID Nos. 1-50 and 51-100 are provided as SEQ ID Nos. 101-150/151-200 in the accompanying Sequence Listing.
The PCR primers may be oligonucleotides of 10, 15, 20 or more bases in length which enable the amplification of the
polymorphic site in the markers. In some embodiments, the amplification product produced using these primers may be
at least 100 bases in length (i.e. about 50 nucleotides on each side of the polymorphic base). In other embodiments, the
amplification product produced using these primers may be at least 500 bases in length (i.e. about 250 nucleotides on
each side of the polymorphic base). In still further embodiments, the amplification product produced using these primers
may be at least 1000 bases in length (i.e. about 500 nucleotides on each side of the polymorphic base).
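
The length criteria above can be checked mechanically for a candidate primer pair. The following is a minimal sketch only; the function names, the 0-based end-exclusive coordinate convention, and the default threshold are illustrative assumptions and are not taken from the specification.

```python
def amplicon_flanks(amplicon_start, amplicon_end, snp_pos):
    """Return (upstream, downstream) base counts flanking the polymorphic base.

    Coordinates are assumed to be 0-based and end-exclusive on the same
    reference sequence (a convention chosen for this sketch).
    """
    if not amplicon_start <= snp_pos < amplicon_end:
        raise ValueError("polymorphic base lies outside the amplicon")
    upstream = snp_pos - amplicon_start
    downstream = amplicon_end - snp_pos - 1
    return upstream, downstream


def amplicon_ok(amplicon_start, amplicon_end, snp_pos, min_length=100):
    """True if the amplicon contains the polymorphic base and is long enough."""
    try:
        amplicon_flanks(amplicon_start, amplicon_end, snp_pos)
    except ValueError:
        return False
    return amplicon_end - amplicon_start >= min_length
```

For instance, an amplicon spanning positions 0 to 101 with the polymorphic base at position 50 has 50 nucleotides on each side and satisfies the 100-base embodiment; passing `min_length=500` or `min_length=1000` checks the longer embodiments.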

Table 9 lists the internal identification numbers of the 50 localized markers described herein and the Apo E 
markers described herein, the SEQ ID Nos. for each of the two alleles of these biallelic markers, the SEQ ID Nos. of
representative upstream and downstream amplification primers which can be used to generate amplification products
including the polymorphic bases of these biallelic markers, and the SEQ ID Nos. of microsequencing primers which can be
used to determine the identities of the polymorphic bases of these markers.

Table 10

Marker          SEQ ID Nos.        SEQ ID Nos.               SEQ ID Nos.
(Genset code)   First    Second    Amplification primers     Microsequencing primers
                allele   allele    Upstream    Downstream    1       2

99-2647         49       99        149         199           249     299
99-2649         50       100       150         200           250     300

It will be appreciated that the primers listed in Table 9 are merely exemplary and that any other set of primers
which produce amplification products containing the polymorphic nucleotides of one or more of the biallelic markers of
SEQ ID Nos: 1-50 and 51-100 or biallelic markers in linkage disequilibrium therewith or with a causative mutation for a
detectable trait or a combination thereof may be used in the diagnostic methods. It will also be appreciated that these
diagnostic methods may be performed with any biallelic marker or combination of biallelic markers included in the maps 
of the present invention. 

Following the PCR amplification, the identities of the polymorphic bases of one or more of the biallelic markers
in the nucleic acid sample are determined. The identities of the polymorphic bases may be determined using the
microsequencing procedures described in Example 13. It will be appreciated that the microsequencing primers listed as
SEQ ID Nos: 201-250 and 251-300 are merely exemplary and that any primer having a 3' end near the polymorphic
nucleotide, and preferably immediately adjacent to the polymorphic nucleotide, may be used. Similarly, it will be 
appreciated that microsequencing analysis may be performed for any marker or combination of markers in the maps of 
the present invention. 
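
As an illustration of the preferred primer geometry (a 3' end immediately adjacent to the polymorphic nucleotide), the sketch below derives candidate microsequencing primers from a marker sequence. The function names, the default primer length, and the idea that a second primer is taken from the opposite strand are assumptions made for this example; the specification does not state how the listed primers were chosen.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sense_primer(sequence, snp_index, length=19):
    """Primer (5'->3') whose 3' end is the base just upstream of the polymorphic base."""
    if snp_index < length:
        raise ValueError("not enough upstream sequence for this primer length")
    return sequence[snp_index - length:snp_index]


def antisense_primer(sequence, snp_index, length=19):
    """Primer on the opposite strand, with its 3' end adjacent to the polymorphic base."""
    start = snp_index + 1
    end = start + length
    if end > len(sequence):
        raise ValueError("not enough downstream sequence for this primer length")
    return reverse_complement(sequence[start:end])
```

Here `snp_index` is the 0-based position of the polymorphic base in `sequence`; neither primer includes that base, so a single-base extension reports its identity.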

Alternatively, the microsequencing analysis may be performed as described in Pastinen et al., Genome
Research 7:606-614 (1997), the disclosure of which is incorporated herein by reference, and which is described in more
detail below. 

Alternatively, the PCR product may be completely sequenced to determine the identities of the polymorphic 
bases in the biallelic markers. In another method, the identities of the polymorphic bases in the biallelic markers are 
determined by hybridizing the amplification products to microarrays containing allele specific oligonucleotides specific 
for the polymorphic bases in the biallelic markers. The use of microarrays comprising allele specific oligonucleotides is
described in more detail below. 

It will be appreciated that the identities of the polymorphic bases in the biallelic markers may be determined 
using techniques other than those listed above, such as conventional dot blot analyses.

Nucleic acids used in the above diagnostic procedures may comprise at least 10 consecutive nucleotides,
including the polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences
complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic
markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. Alternatively, the
nucleic acids used in the above diagnostic procedures may comprise at least 15 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In some embodiments, the
nucleic acids used in the above diagnostic procedures may comprise at least 20 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In still other embodiments,
the nucleic acids used in the above diagnostic procedures may comprise at least 30 consecutive nucleotides, including
the polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In further embodiments, the
nucleic acids used in the above diagnostic procedures may comprise more than 30 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In still further embodiments,
the nucleic acids used in the above diagnostic procedures may comprise the entire sequence of the biallelic markers in
the maps of the present invention, including any of the 653 biallelic markers obtained above (which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers, including those of SEQ ID Nos. 301-305/307-311
or the sequences complementary thereto. In some embodiments the nucleic acids used in the diagnostic procedures
are longer than the sequences of SEQ ID Nos. 1-50, 51-100, 301-305 and 307-311 because they contain nucleotides
adjacent to these sequences.

The diagnostics of the present invention may also employ nucleic acid arrays attached to DNA chips or any 
other suitable solid support, including beads. As used herein, the term array means a one dimensional, two dimensional, or
multidimensional arrangement of a plurality of nucleic acids of sufficient length to permit specific detection of nucleic acids 
capable of hybridizing thereto. 

DNA chips allow the integration of micro-biochemical processes (such as DNA hybridization), systems of signal
detection (such as fluorescence) and data processing into a single system which can be used to obtain information on
polymorphism. The solid surface of the chip is often made of silicon or glass but it can be a polymeric membrane.
Efficient access to polymorphism information is obtained through a basic structure comprising high density arrays of
oligonucleotide probes attached to a solid support (the chip) at selected positions. The immobilization of arrays of DNA
probes on solid supports has been rendered possible by the development of a technology generally identified as 'Very
Large Scale Immobilized Polymer Synthesis' (VLSIPS™) and in which, typically, probes are immobilized in a high density
array on a solid surface of a chip. Examples of VLSIPS™ technologies are provided in US Patents 5,143,854 and
5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and WO 95/11995, the disclosures of which are
incorporated herein by reference, which describe methods for forming oligonucleotide arrays through techniques such as
light-directed synthesis techniques.

In designing strategies aimed at providing arrays of nucleotides immobilized on solid supports, further
presentation strategies were developed to order and display the probe arrays on the chips in an attempt to maximize
hybridization patterns and sequence information. Examples of such presentation strategies are disclosed in PCT
Publications WO 94/12305, WO 94/11530, WO 97/29212 and WO 97/31256, the disclosures of which are incorporated
herein by reference.

Each DNA chip can contain thousands to millions of individual synthetic DNA probes arranged in a grid-like
pattern and miniaturized to the size of a dime.

The chip technology has been successfully used to detect mutations in numerous cases. For example, the
screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains, and in the protease
gene of HIV-1 virus (see Hacia et al., Nat. Genet. 14:441-447 (1996); Shoemaker et al., Nat. Genet. 14:450-456 (1996);
Kozal et al., Nat. Med. 2:753-759 (1996), the disclosures of which are incorporated herein by reference). At least three
companies propose chips able to detect biallelic polymorphisms: Affymetrix (GeneChip), Hyseq (HyChip and HyGnostics),
and Protogene Laboratories.

In some embodiments, the efficiency of hybridization of nucleic acids in the sample with the probes attached to
the chip may be improved by using polyacrylamide gel pads isolated from one another by hydrophobic regions in which
the DNA probes are covalently linked to an acrylamide matrix.

The polymorphic bases present in the biallelic marker or markers of the sample nucleic acids are determined as
follows. Probes which contain at least a portion of one or more of the biallelic markers of the present invention are
synthesized either in situ or by conventional synthesis and immobilized on an appropriate chip using methods known to
the skilled technician.

The nucleic acid sample which includes the candidate region to be analyzed is isolated, amplified with primers
capable of generating an amplification product containing the polymorphic bases of one or more biallelic markers, and
labeled with a reporter group. The reporter group can be a fluorescent group such as phycoerythrin. The labeled nucleic
acid is then incubated with the probes immobilized on the chip using a fluidics station. For example, Manz et al. (Adv. in
Chromatogr. 33:1-66 (1993), the disclosure of which is incorporated herein by reference) describe the fabrication of
fluidics devices, and particularly microcapillary devices, in silicon and glass substrates.

After the reaction is completed, the chip is inserted into a scanner and patterns of hybridization are detected.
The hybridization data is collected as a signal emitted from the reporter groups already incorporated into the nucleic
acids generated in the amplification of the sample DNA, which is now bound to the probes attached to the chip. Probes
that perfectly match a sequence of the nucleic acid sample generally produce stronger signals than those that have
mismatches. Since the sequence and position of each probe immobilized on the chip is known, the identity of the nucleic
acid hybridized to a given probe can be determined.

For single-nucleotide polymorphism analyses, sets of four oligonucleotides are generally designed (one for each
possible base) that span each position of a portion of the candidate region found in the nucleic acid sample, differing only
in the identity of the central base. The relative intensity of hybridization to each series of probes at a particular location
allows the identification of the base corresponding to the central base of the probe. For example, to detect single
nucleotide polymorphisms such as those in the present biallelic markers, oligonucleotides having each of the two allelic
bases at their central position are affixed to the chip. The amplification products resulting from amplification of the
nucleic acids in the sample are hybridized to the chip under high stringency (at lower salt concentration and higher
temperature over shorter time periods) to facilitate specific detection of the polymorphic sequences present in the
nucleic acid sample.
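
The probe design and read-out described above can be sketched as follows. This is a hedged illustration only: the flank length, the `het_ratio` threshold used to separate heterozygous from homozygous calls, and all function names are assumptions of this example and do not come from the specification or from any chip vendor's software.

```python
BASES = "ACGT"


def probe_set(reference, center, flank=12):
    """Four probes spanning `center`, differing only in the central base."""
    if center - flank < 0 or center + flank >= len(reference):
        raise ValueError("not enough flanking sequence around the central base")
    left = reference[center - flank:center]
    right = reference[center + 1:center + 1 + flank]
    return {base: left + base + right for base in BASES}


def call_genotype(intensities, allele_a, allele_b, het_ratio=0.5):
    """Call a biallelic genotype from the signals of the two allele-specific probes.

    `intensities` maps a central base to its hybridization signal. The sample
    is called heterozygous when the weaker allele signal is at least
    `het_ratio` times the stronger one (an arbitrary illustrative cut-off).
    """
    a = intensities[allele_a]
    b = intensities[allele_b]
    if a <= 0 and b <= 0:
        return None  # no usable signal for either allele
    strong, weak = max(a, b), min(a, b)
    if weak >= het_ratio * strong:
        return allele_a + allele_b
    return allele_a * 2 if a > b else allele_b * 2
```

In practice the cut-off would have to be calibrated against samples of known genotype; the sketch only shows that the call follows from the relative intensities of perfectly matched and mismatched probes.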

The use of direct electric field control improves the determination of single base mutations (Nanogen). A
positive field increases the transport rate of negatively charged nucleic acids and results in a 10-fold increase of the
hybridization rates. Using this technique, single base pair mismatches are detected in less than 15 sec (see Sosnowski et
al., Proc. Natl. Acad. Sci. USA 94:1119-1123 (1997), the disclosure of which is incorporated herein by reference).

Another technique which can be used to analyze polymorphisms includes multicomponent integrated systems
which miniaturize and compartmentalize processes such as restriction enzyme digestion, PCR reactions, and capillary
electrophoresis in a single functional device. An example of such technique is disclosed in US patent 5,589,136, the
disclosure of which is incorporated herein by reference, which concerns the integration of PCR amplification and
capillary electrophoresis in chips. Integrated systems are best applied with microfluidic systems. These systems
comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The
movements of the samples are controlled by electric forces applied across different areas of the microchip to create
functional microscopic valves and pumps with no moving parts. Regulating or varying the voltage controls the liquid flow
at intersections between the micro-machined channels and changes the liquid flow rate for pumping across different
sections of the microchip.

In the case of biallelic marker analyses, the micro-chip integrates nucleic acid amplification, a microsequencing
reaction (such as the one described above), capillary electrophoresis and a detection method such as laser-induced
fluorescence detection.

In a first step, the DNA samples are amplified, preferably by PCR. Then, the amplification products are
subjected to automated microsequencing reactions using ddNTPs (specific fluorescence for each ddNTP) and the
appropriate oligonucleotide microsequencing primers which hybridize just upstream of the targeted polymorphic base.
The microsequencing reactions may employ primers capable of being extended to the polymorphic bases of the biallelic
markers. Preferably, the microsequencing primers comprise a sequence terminating at the base immediately preceding
the polymorphic base of the biallelic markers. Once the extension at the 3' end is completed, the primers are separated
from the unincorporated fluorescent ddNTPs by capillary electrophoresis. The separation medium used in capillary
electrophoresis can for example be polyacrylamide, polyethyleneglycol or dextran. The incorporated ddNTPs in the
single-nucleotide primer extension products are identified by fluorescence detection. Preferably, the micro-chip can be used to
process at least 96 samples in parallel. More preferably, the micro-chip can be used to process at least 384 samples in
parallel. Preferably, the microchip is designed for use with detection procedures using four color laser induced
fluorescence detection of the ddNTPs.
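
The logic of the single-nucleotide primer extension read-out can be modelled in a few lines. The sketch below assumes the primer is written in the same orientation as the allele sequences and occurs exactly once in them; the dye labels are placeholders, not the fluorophores used in any particular instrument.

```python
def incorporated_ddntp(allele_sequence, primer):
    """Base added to the primer's 3' end when it is extended along this allele.

    The primer is assumed to match `allele_sequence` (same strand, 5'->3')
    at a single location ending just before the polymorphic base.
    """
    start = allele_sequence.find(primer)
    if start < 0:
        raise ValueError("primer does not match the allele sequence")
    next_index = start + len(primer)
    if next_index >= len(allele_sequence):
        raise ValueError("no base follows the primer on this sequence")
    return allele_sequence[next_index]


def expected_dyes(allele_sequences, primer, dye_for_base):
    """Sorted dye labels expected for a sample carrying the given alleles."""
    return sorted({dye_for_base[incorporated_ddntp(seq, primer)]
                   for seq in allele_sequences})
```

A homozygous sample yields one dye signal and a heterozygous sample two, which is what four color detection of the extension products distinguishes.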

Any one or more alleles of the biallelic markers in the maps of the present invention, or fragments thereof
containing the polymorphic bases, may be fixed to a solid support such as a microchip or other immobilizing surface. The
fragments of these nucleic acids may comprise at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides of the biallelic markers described herein. Preferably, the fragments include the polymorphic bases of
the biallelic markers.
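
To make the fragment definition concrete, the sketch below enumerates every fragment of a given length that still contains the polymorphic base. The function name and the 0-based index convention are assumptions of this example.

```python
def fragments_with_polymorphic_base(marker, snp_index, length):
    """All substrings of `length` consecutive nucleotides that include `snp_index`."""
    first_start = max(0, snp_index - length + 1)
    last_start = min(snp_index, len(marker) - length)
    return [marker[start:start + length]
            for start in range(first_start, last_start + 1)]
```

Calling it with `length` set to 10, 15, 20 or 25 lists the candidate fragments for the corresponding embodiments; an empty list means the marker sequence is shorter than the requested length.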

A nucleic acid sample is applied to the immobilizing surface and analyzed to determine the identities of the
polymorphic bases of one or more of the biallelic markers. In some embodiments, the solid support may also include one or
more of the amplification primers described herein, or fragments comprising at least 10, at least 15, or at least 20
consecutive nucleotides thereof, for generating an amplification product containing the polymorphic bases of the biallelic
markers to be analyzed in the sample.

Another embodiment of the present invention is a solid support which includes one or more of the microsequencing
primers listed in the accompanying Sequence Listing, or fragments comprising at least 10, at least 15, or at least 20
consecutive nucleotides thereof and having a 3' terminus immediately upstream of the polymorphic base of the
corresponding biallelic marker, for determining the identity of the polymorphic base of the one or more biallelic markers fixed
to the solid support.

For example, one embodiment of the present invention is an array of nucleic acids fixed to a solid support, such as
a microchip, bead, or other immobilizing surface, comprising one or more of the biallelic markers in the maps of the present
invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides
thereof including the polymorphic base. For example, the array may comprise one or more of any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including SEQ ID Nos. 301-305/307-311) or the
sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than
25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment, the array comprises at least
five of the biallelic markers in the maps of the present invention or a fragment comprising at least 10, at least 15, at least
20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. For example, the arrays
may comprise at least five of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic
markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a
fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof
including the polymorphic base. In a further embodiment, the array comprises at least 10 of the biallelic markers in the
maps of the present invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. For example, the array may comprise at least 10 of any of
the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the
asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of
SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15,
at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further
embodiment, the array comprises at least 20 of the biallelic markers in the maps of the present invention or a fragment
comprising at least 15 consecutive nucleotides thereof including the polymorphic base. For example, the array may comprise
at least 20 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including
the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment comprising at
least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. In a further embodiment, the array comprises at least 100 of the biallelic markers in the maps of the present
invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides
thereof including the polymorphic base. For example, the array may comprise at least 100 of any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of SEQ ID Nos.
301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at
least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment, the
array comprises at least 200 of the biallelic markers in the maps of the present invention or a fragment comprising
at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. For example, the array may comprise at least 200 of any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences
complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. In a further embodiment, the array comprises at least 300
of the biallelic markers in the maps of the present invention or a fragment comprising at least 10, at least 15, at least 20, at
least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. For example, the array may
comprise at least 300 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos.
1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers
(including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment
comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the
polymorphic base. In a further embodiment, the array comprises at least 400 of the biallelic markers in the maps of the
present invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive
nucleotides thereof including the polymorphic base. For example, the array may comprise at least 400 of any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of SEQ ID Nos.
301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20,
at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment, the
array comprises more than 400 of the biallelic markers in the maps of the present invention or a fragment comprising at
least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. For example, the array may comprise at least 400 of any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences
complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. Each of the embodiments listed above may also include one
or more of the sequences of SEQ ID Nos. 306 and 312 in addition to those enumerated above.

Another embodiment of the present invention is an array comprising amplification primers for generating
amplification products containing the polymorphic bases of one or more, at least five, at least 10, at least 20, at least 100,
at least 200, at least 300, at least 400, or more than 400 of the biallelic markers in the maps of the present invention. For
example, the array may comprise amplification primers for generating amplification products containing the polymorphic
bases of one or more, at least five, at least 10, at least 20, at least 100, at least 200, at least 300, at least 400, or more
than 400 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences
complementary thereto). In such arrays, the amplification primers included in the array are capable of amplifying the
biallelic marker sequences to be detected in the nucleic acid sample applied to the array (i.e. the amplification primers
correspond to the biallelic markers affixed to the array). For example, if the array is designed to detect the biallelic marker of
SEQ ID Nos. 1 and 51 it may also contain SEQ ID Nos. 101 and 151, the amplification primers capable of generating an
amplicon which includes SEQ ID Nos. 1 and 51. Thus, the arrays may include one or more of the amplification primers
of SEQ ID Nos. 101-200, 313-317, and 319-323 corresponding to the one or more biallelic markers of SEQ ID Nos. 1-50,
51-100, 301-305, and 307-311 which are included in the array. In other embodiments, the arrays may include
amplification primers capable of generating an amplification product which includes the biallelic markers SEQ ID Nos.
306 and 312 in addition to amplification primers capable of generating an amplification product containing each of the
markers enumerated above. Thus, in such embodiments, the arrays may further include the amplification primers of SEQ
ID Nos. 318 and 324.

Another embodiment of the present invention is an array which includes microsequencing primers capable of
determining the identity of the polymorphic bases of one or more, at least five, at least 10, at least 20, at least 100, at least
200, at least 300, at least 400, or more than 400 of the biallelic markers in the maps of the present invention. For
example, the array may comprise microsequencing primers capable of determining the identity of the polymorphic bases of
one or more, at least five, at least 10, at least 20, at least 100, at least 200, at least 300, at least 400, or more than 400
of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the
sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo
E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto).
The sequences of representative microsequencing primers which may be included in the array are listed in the sequence
listing as SEQ ID Nos. 201-300, 325-329, and 331-335. In other embodiments, the arrays may further include
microsequencing primers for determining the identity of the polymorphic bases of one or more of the sequences of SEQ
ID Nos. 306 and 312, such as the microsequencing primers of SEQ ID Nos. 330 and 336.

Arrays containing any combination of the above nucleic acids which permits the specific detection or 
identification of the polymorphic bases of the biallelic markers in the maps of the present invention, including any 
combination of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 
or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the 
new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305 and 307-311 or the sequences complementary 
thereto) are also within the scope of the present invention. Other embodiments of the arrays include nucleic acids which 
permit the specific detection or identification of the polymorphic bases of one or more of SEQ ID Nos. 306 and 312 in 
addition to the nucleic acids permitting the specific detection or identification of the polymorphic bases of the biallelic 
markers listed in the preceding sentence. For example, the array may comprise both the biallelic markers and 
amplification primers capable of generating amplification products containing the polymorphic bases of the biallelic 
markers. Alternatively, the array may comprise both amplification primers capable of generating amplification products 
containing the polymorphic bases of the biallelic markers and microsequencing primers capable of determining the 
identities of the polymorphic bases of these markers. 

Although the above examples describe arrays comprising specific groups of biallelic markers and, in some 
embodiments, specific amplification primers and microsequencing primers, it will be appreciated that the present 
invention encompasses arrays including any biallelic marker, group of biallelic markers, amplification primer, group of 
amplification primers, microsequencing primer, or group of microsequencing primers described herein, as well as any 
combination of the preceding nucleic acids. 

Alternatively, the microsequencing procedures described above may be used to determine whether an individual 
possesses a pattern of biallelic marker alleles associated with a detectable trait. In this approach, a PCR reaction is 
performed on the DNA or RNA of the individual to be tested to amplify the desired biallelic markers or portions thereof. The 
amplification product is hybridized to one or more oligonucleotides having their 3' end one base from the position of the 
polymorphic bases of the biallelic markers which are fixed to a surface. The oligonucleotides are extended one base using a 
detectably labeled dNTP and a polymerase. Incorporation of a pattern of detectably labeled bases indicative of a biallelic 
marker pattern associated with a detectable trait indicates that the individual suffers from a detectable trait as the result of 
a particular mutation or that the individual is at risk for developing the detectable trait at a subsequent time. 
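
The single-base extension readout described above can be illustrated with a short sketch. It is a simplified model and not the procedure of the invention: the function names, marker names, and sequences are hypothetical, each sample is treated as carrying a single allele per marker, and the polymerase step is modelled as adding the base that follows the primer on the strand the primer was copied from.

```python
def extended_base(sense_strand: str, primer: str) -> str:
    """Return the labeled base added to a primer in a one-base extension.

    Assumption: the primer is identical to the sense strand immediately 5' of
    the polymorphic base, so it anneals to the complementary strand and is
    extended by the sense-strand base that follows it.
    """
    start = sense_strand.find(primer)
    if start < 0:
        raise ValueError("primer does not anneal to this amplification product")
    site = start + len(primer)
    if site >= len(sense_strand):
        raise ValueError("no template base available 3' of the primer")
    return sense_strand[site]


def matches_trait_pattern(amplicons, primers, trait_pattern):
    """Compare the incorporated bases with a trait-associated allele pattern.

    amplicons, primers: dicts keyed by (hypothetical) marker name.
    trait_pattern: dict mapping marker name to the trait-associated base.
    Returns the observed bases and whether every marker matches the pattern.
    """
    observed = {m: extended_base(amplicons[m], primers[m]) for m in primers}
    is_match = all(observed[m] == base for m, base in trait_pattern.items())
    return observed, is_match


# Hypothetical amplification products and microsequencing primers.
amplicons = {"m1": "ACGTTGCA", "m2": "GGATCCTA"}
primers = {"m1": "ACGT", "m2": "GGATC"}
observed, is_match = matches_trait_pattern(amplicons, primers, {"m1": "T", "m2": "C"})
```

A heterozygous sample would incorporate two different labeled bases at one marker; handling that case would require returning a set of bases per marker rather than a single base.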

In addition to their use in diagnostic techniques such as those described above, any of the arrays described above 
may also be used to identify a haplotype (i.e., a set of alleles of biallelic markers) which is associated with a particular trait. 
As described above, in such analyses nucleic acid samples are obtained from trait positive and trait negative individuals and 
the alleles of biallelic markers present in each population are determined to identify a haplotype which is statistically 
associated with the trait. The arrays may be employed in haplotype analyses as follows. Nucleic acid samples obtained 
from trait positive and trait negative individuals are amplified with primers capable of generating amplification products 
which include the polymorphic bases of the biallelic markers. The amplification products are labeled with a reporter group 
and allowed to contact the biallelic marker probes which are attached to the support. As described above, the biallelic 
marker probes to which the labeled amplification products specifically hybridize are determined to indicate which alleles of 
the biallelic markers are present in the samples. The patterns of alleles of biallelic markers in the trait positive and trait 
negative individuals are then determined to identify a haplotype having a statistically significant association with the trait. 
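
As a rough illustration of the statistical comparison of trait positive and trait negative individuals, the sketch below counts each haplotype in the two groups and computes a Pearson chi-square statistic for a 2x2 table. The text does not prescribe a particular test; the choice of chi-square, the function names, and the example data are assumptions, and the haplotypes are assumed to be already phased and represented as tuples of alleles.

```python
from collections import Counter


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def haplotype_association(trait_positive, trait_negative):
    """Return a chi-square statistic for every haplotype seen in either group.

    Each argument is a list of haplotypes, one per sampled chromosome,
    where a haplotype is a tuple of alleles at neighbouring markers.
    """
    pos, neg = Counter(trait_positive), Counter(trait_negative)
    n_pos, n_neg = len(trait_positive), len(trait_negative)
    results = {}
    for hap in set(pos) | set(neg):
        a, c = pos[hap], neg[hap]
        # Rows: trait positive / trait negative; columns: carries hap / does not.
        results[hap] = chi_square_2x2(a, n_pos - a, c, n_neg - c)
    return results


# Hypothetical two-marker haplotypes in ten chromosomes per group.
trait_positive = [("A", "G")] * 8 + [("T", "C")] * 2
trait_negative = [("A", "G")] * 2 + [("T", "C")] * 8
scores = haplotype_association(trait_positive, trait_negative)
```

In this invented example the ("A", "G") haplotype scores 7.2, which exceeds 3.84, the 5% critical value of the chi-square distribution with one degree of freedom. A real analysis would also need to correct for the number of haplotypes tested.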

Alternatively, as described above, the nucleic acid samples from trait positive and trait negative individuals may be 
applied to an array comprising amplification primers capable of generating amplification products which include the 
polymorphic bases of the biallelic markers. The identities of the polymorphic bases in the amplification products are then 
determined using techniques such as the microsequencing procedures disclosed herein. Alternatively, amplification can be 
conducted in liquid phase and microsequencing may be conducted on the array. 

Alternatively, both amplification and microsequencing reactions may be performed in liquid phase. In such 
embodiments, the labeled nucleotides incorporated in the microsequencing primers during the microsequencing reactions are 
detected by hybridizing the extended microsequencing primers to sequences complementary to the microsequencing primers. 
The sequences complementary to the microsequencing primers are immobilized on a support, such as those described above. 
The amplification and microsequencing reactions performed in liquid phase may be multiplexed, allowing the samples to be 
tested simultaneously for tens, hundreds, thousands or more biallelic markers. 

Preferably, the array used in the haplotype analysis comprises one or more groups of biallelic markers known to be 
located in proximity to one another in the genome. For example, the biallelic markers in the groups may be derived from a 
single YAC insert, a single BAC insert, or a BAC subclone. Alternatively, the biallelic markers in the groups may be derived 
from adjacent ordered clones. The biallelic markers in the groups may be located within a genomic region spanning less than 
1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb, from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 
500kb, from 500kb to 1Mb, or more than 1Mb. In some embodiments, the biallelic markers in the groups comprise biallelic 
markers which have been localized to the same chromosome, subchromosomal region, or gene. 

It will be appreciated that the ordered DNA containing the biallelic markers need not completely cover the genomic 
regions of these lengths but may instead be incomplete contigs having one or more gaps therein. 
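
One way to picture the grouping of markers by genomic proximity is the sketch below, which walks along markers sorted by position and starts a new group whenever the span from the first marker of the current group would exceed a chosen limit. The greedy rule, the function name, and the positions are illustrative assumptions; the text does not specify how groups are formed beyond the spans they cover.

```python
def group_by_proximity(positions_kb, max_span_kb):
    """Greedily group markers so that each group spans at most max_span_kb.

    positions_kb: dict mapping a (hypothetical) marker name to its position
    in kb along an ordered contig. Gaps in the contig are allowed; only the
    assigned positions matter.
    Returns a list of groups, each a list of marker names in genomic order.
    """
    groups = []
    for name, pos in sorted(positions_kb.items(), key=lambda kv: kv[1]):
        # groups[-1][0][1] is the position of the first marker in the open group.
        if groups and pos - groups[-1][0][1] <= max_span_kb:
            groups[-1].append((name, pos))
        else:
            groups.append([(name, pos)])
    return [[name for name, _ in group] for group in groups]


# Hypothetical marker positions (kb) on one contig, grouped within 5 kb spans.
positions = {"m1": 0, "m2": 3, "m3": 5, "m4": 40, "m5": 44}
groups = group_by_proximity(positions, max_span_kb=5)
```

Raising `max_span_kb` to 50 would place all five hypothetical markers in a single group, corresponding to the "from 25 to 50kb" range above.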

In some embodiments, the biallelic markers known to be located in proximity to one another in the genome may be 
located in physical proximity on the array. For example, the array may comprise one or more groups of at least 3 biallelic 
markers known to be located in proximity to one another in the genome. In some embodiments, the array may comprise one 
or more groups of at least 6 biallelic markers known to be located in proximity to one another in the genome. In other 
embodiments, the array may comprise one or more groups of at least 20 biallelic markers known to be located in proximity 
to one another in the genome. 

The array may comprise one or more groups of biallelic markers known to be located on the same subchromosomal 
region. For example, the array could comprise two or more biallelic markers located at 21q11.2 (selected from the group 
consisting of SEQ ID Nos. 29, 79, 30, and 80), two or more markers located at 21q21 (selected from the group consisting of 
SEQ ID Nos. 1, 51, 2, 52, 3, and 53), two or more markers located at 21q21.2 (selected from the group consisting of SEQ ID 
Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21, and 71), two or more markers located at 21q21.3-q22.13 (selected from the group 
consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 77, 28, 78, 31, 81, 32, 82, 38, 88, 39, 89, 40, 90, 48, 98, 49, 99, 50, 100, 
22, 72, 23, 73, 24, 74, 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, 60, 11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66), 
two or more markers located at 21q22.2 (selected from the group consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 
94, 45, 95, 46, 96, 47, and 97), and two or more markers located at 21q22.3 (selected from the group consisting of SEQ 
ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87). Alternatively, the array could comprise amplification primers capable of 
generating an amplification product containing the polymorphic bases of two or more biallelic markers located at 21q11.2 
(for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or 
more biallelic markers selected from the group consisting of SEQ ID Nos. 29, 79, 30, and 80), two or more markers located 
at 21q21 (for example, amplification primers capable of generating an amplification product containing the polymorphic 
bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 1, 51, 2, 52, 3, and 53), two or 
more markers located at 21q21.2 (for example, amplification primers capable of generating an amplification product 
containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 17, 67, 
18, 68, 19, 69, 20, 70, 21, and 71), two or more markers located at 21q21.3-q22.13 (for example, amplification primers 
capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected 
from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 77, 28, 78, 31, 81, 32, 82, 38, 88, 39, 89, 40, 90, 48, 98, 49, 
99, 50, 100, 22, 72, 23, 73, 24, 74, 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, 60, 11, 61, 12, 62, 13, 63, 14, 64, 15, 
65, 16, and 66), two or more markers located at 21q22.2 (for example, amplification primers capable of generating an 
amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting 
of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97), and two or more markers located at 21q22.3 
(for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two 
or more biallelic markers selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87). 

In some embodiments, the array may comprise one or more groups of biallelic markers derived from the same BAC 
insert. For example, the array could comprise two or more markers selected from the group consisting of SEQ ID Nos. 29, 
79, 30, and 80 (derived from BAC 1), two or more markers selected from the group consisting of SEQ ID Nos. 1 and 51 
(derived from BAC 2), two or more markers selected from the group consisting of SEQ ID Nos. 2, 52, 3, and 53 (derived 
from BAC 3), two or more markers selected from the group consisting of SEQ ID Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21, 
and 71 (derived from BAC 4), two or more markers selected from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 
and 77 (derived from BAC 5), two or more markers selected from the group consisting of SEQ ID Nos. 28, 78, 31, 81, 32, and 
82 (derived from BAC 6), two or more markers selected from the group consisting of SEQ ID Nos. 38, 88, 39, 89, 40, and 
90 (derived from BAC 7), two or more markers selected from the group consisting of SEQ ID Nos. 48, 98, 49, 99, 50, and 
100 (derived from BAC 8), two or more markers selected from the group consisting of SEQ ID Nos. 22, 72, 23, 73, 24, and 
74 (derived from BAC 9), two or more markers selected from the group consisting of SEQ ID Nos. 4, 54, 5, 55, 6, 56, 7, 57, 
8, 58, 9, 59, 10, and 60 (derived from BAC 10), two or more markers selected from the group consisting of SEQ ID Nos. 
11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66 (derived from BAC 11), two or more markers selected from the group 
consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97 (derived from BAC 12), or two or more 
markers selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87 (derived from BAC 
13). 

Arrays comprising biallelic markers known to be located in proximity to one another in the genome permit 
haplotyping analyses to be conducted even when the chromosomal locations of the biallelic markers have not been 
determined. For example, using the procedures described above, the alleles of sets of biallelic markers which are present in 
nucleic acid samples from trait positive and trait negative individuals may be determined using a succession of arrays, with 
each array having one or more groups of nucleic acids known to be located in proximity to one another thereon. The 
succession of arrays may comprise biallelic markers spanning the entire genome having any of the average intermarker 
distances specified above. Alternatively, the succession of arrays need not span the entire genome but may instead be 
derived from two or more contigated YAC, BAC, or BAC subclone inserts. A statistical analysis is performed on the alleles 
of biallelic markers present in the trait positive and trait negative individuals to identify a haplotype having a statistically 
significant association with the trait. Once a statistically significant haplotype is identified, the genomic locations of the 
biallelic markers comprising the haplotype may be determined using the methods described herein. In addition, using the 
procedures described herein, the genomic region harboring the biallelic markers in the statistically significant haplotype may 
be evaluated to identify the genes associated with the trait. 

Although this invention has been described in terms of certain preferred embodiments, other embodiments which 
will be apparent to those of ordinary skill in the art in view of the disclosure herein are also within the scope of this 
invention. Accordingly, the scope of the invention is intended to be defined only by reference to the appended claims. 

Table 1

Biallelic marker    BAC    Insert size    Average intermarker    Subchromosomal
(Genset code)              (kb)           distance (kb)          localization

99-2378             1      150            75                     21q11.2
99-2381             1      150            75                     21q11.2
99-2103             2      110            110                    21q21
99-2228             3      105            52.5                   21q21
99-2229             3      105            52.5                   21q21
99-2312             4      130            26                     21q21.2
99-2315             4      130            26                     21q21.2
99-2320             4      130            26                     21q21.2
99-2321             4      130            26                     21q21.2
99-2324             4      130            26                     21q21.2
99-2362             5      100            33.3                   21q21.3-q22.13
99-2364             5      100            33.3                   21q21.3-q22.13
99-2367             5      100            33.3                   21q21.3-q22.13
99-2371             6      135            45                     21q22.11-q22.13
99-2413             6      135            45                     21q22.11-q22.13
99-2419             6      135            45                     21q22.11-q22.13
99-2610             7      185            61.7                   21q22.11-q22.13
99-2615             7      185            61.7                   21q22.11-q22.13
99-2620             7      185            61.7                   21q22.11-q22.13
99-2645             8      250            83.3                   21q22.11-q22.13
99-2647             8      250            83.3                   21q22.11-q22.13
99-2649             8      250            83.3                   21q22.11-q22.13
99-2333             9      140            46.7                   21q22.11-q22.13
99-2341             9      140            46.7                   21q22.11-q22.13
99-2342             9      140            46.7                   21q22.11-q22.13
99-2240             10     95             13.6                   21q22.11-q22.13
99-2242             10     95             13.6                   21q22.11-q22.13
99-2244             10     95             13.6                   21q22.11-q22.13
99-2246             10     95             13.6                   21q22.11-q22.13
99-2248             10     95             13.6                   21q22.11-q22.13
99-2250             10     95             13.6                   21q22.11-q22.13
99-2251             10     95             13.6                   21q22.11-q22.13
99-2269             11     40             6.7                    21q22.11-q22.13
99-2271             11     40             6.7                    21q22.11-q22.13
99-2272             11     40             6.7                    21q22.11-q22.13
99-2273             11     40             6.7                    21q22.11-q22.13
99-2275             11     40             6.7                    21q22.11-q22.13
99-2278             11     40             6.7                    21q22.11-q22.13
99-2624             12     165            23.6                   21q22.2
99-2625             12     165            23.6                   21q22.2
99-2630             12     165            23.6                   21q22.2
99-2633             12     165            23.6                   21q22.2
99-2634             12     165            23.6                   21q22.2
99-2637             12     165            23.6                   21q22.2
99-2642             12     165            23.6                   21q22.2
99-2559             13     205            41                     21q22.3
99-2566             13     205            41                     21q22.3
99-2567             13     205            41                     21q22.3
99-2570             13     205            41                     21q22.3
99-2571             13     205            41                     21q22.3
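
The average intermarker distances in Table 1 are consistent with dividing each BAC insert size by the number of biallelic markers listed for that insert. The text does not state this formula; the sketch below is an inference from the tabulated values, and the function name is hypothetical.

```python
def average_intermarker_distance(insert_size_kb: float, n_markers: int) -> float:
    """Insert size divided by marker count, rounded to one decimal place.

    Assumption: this reproduces the Table 1 column; the source does not
    state how the column was computed.
    """
    if n_markers <= 0:
        raise ValueError("at least one marker is required")
    return round(insert_size_kb / n_markers, 1)


# (insert size in kb, number of markers) for a few BACs as listed in Table 1.
table_1_examples = {1: (150, 2), 4: (130, 5), 5: (100, 3), 10: (95, 7), 13: (205, 5)}
distances = {
    bac: average_intermarker_distance(size, count)
    for bac, (size, count) in table_1_examples.items()
}
```

For BAC 10, for instance, 95 kb divided by 7 markers gives 13.6 kb, matching the tabulated value.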

WHAT IS CLAIMED IS: 

1. A method of obtaining a set of biallelic markers comprising the steps of: 

obtaining a nucleic acid library comprising a plurality of genomic DNA fragments comprising the full genome or 
a portion thereof; 

determining the order of said plurality of genomic DNA fragments in the genome; 
determining the sequence of selected regions of said plurality of genomic DNA fragments; and 
identifying nucleotides in said plurality of genomic DNA fragments which vary between individuals, thereby 
defining a set of biallelic markers. 

2. The method of Claim 1, wherein said identifying step comprises identifying about 20,000 biallelic 
markers. 

3. The method of Claim 1, wherein said identifying step comprises identifying about 40,000 biallelic 
markers. 

4. The method of Claim 1, wherein said identifying step comprises identifying about 60,000 biallelic 
markers. 

5. The method of Claim 1, wherein said identifying step comprises identifying about 80,000 biallelic 
markers. 

6. The method of Claim 1, wherein said identifying step comprises identifying about 100,000 biallelic 
markers. 

7. The method of Claim 1, wherein said identifying step comprises identifying about 120,000 biallelic 
markers. 

8. The method of Claim 1, wherein said biallelic markers are separated from one another by an average 
distance of 10 kb-200 kb. 

9. The method of Claim 1, wherein said biallelic markers are separated from one another by an average 
distance of 15 kb-150 kb. 

10. The method of Claim 1, wherein said biallelic markers are separated from one another by an average 
distance of 20 kb-100 kb. 

11. The method of Claim 1, wherein said biallelic markers are separated from one another by an average 
distance of 100 kb-150 kb. 

12. The method of Claim 1, wherein said biallelic markers are separated from one another by an average 
distance of 50-100 kb. 

13. The method of Claim 1, wherein said biallelic markers are separated from one another by an average 
distance of 25 kb-50 kb. 

14. The method of Claim 1, wherein the step of determining the sequence of selected regions of said 
plurality of genomic DNA fragments comprises inserting fragments of said plurality of genomic DNA fragments into a 
vector to generate a plurality of subclones and determining the sequence of a region of the inserts in said plurality of 
subclones or a subset thereof. 

15. The method of Claim 14, wherein said step of determining the sequence of a region of said inserts or 
a subset thereof comprises determining the sequence of one or both end regions of said inserts or a subset thereof. 

16. The method of Claim 14, wherein the step of determining the sequence of one or both end regions of 
said plurality of subclones comprises determining the sequence of about 500 bases at each end of said subclones or a 
subset thereof. 

17. The method of Claim 1, wherein a set of about 10,000 to about 20,000 genomic DNA inserts with 
an average size between 100kb and 300kb are ordered. 

18. The method of Claim 1, wherein a set of about 10,000 to about 30,000 genomic DNA inserts with 
an average size between 100kb and 150 kb are ordered. 

19. The method of Claim 1, wherein a set of about 15,000 to about 25,000 genomic DNA inserts with 
an average size between 100kb and 200 kb are ordered. 

20. The method of Claim 1, wherein said identifying step comprises identifying between 1 and 6 biallelic 
markers per genomic DNA fragment. 

21. The method of Claim 1, wherein said identifying step comprises identifying an average of 3 biallelic 
markers per genomic DNA insert. 

22. The method of Claim 1, wherein said genomic DNA fragments are in a Bacterial Artificial 
Chromosome. 

23. The method of Claim 1, wherein said genomic DNA fragments are in a Yeast Artificial Chromosome. 

24. The method of Claim 1, further comprising determining the position of said biallelic markers along the 
genome or a portion thereof. 

25. The method of Claim 24, wherein the step of determining the position of said biallelic markers along 
the genome or portion thereof comprises determining the position of said biallelic markers along a chromosome. 

26. The method of Claim 24, wherein the step of determining the position of said biallelic markers along 
the genome or portion thereof comprises determining the position of said biallelic markers along a subchromosomal 
region. 

27. The method of Claim 1, further comprising identifying biallelic markers which are in linkage 
disequilibrium with one another. 

28. The method of Claim 27, further comprising obtaining pluralities of biallelic markers such that each 
marker is in linkage disequilibrium with at least one identified marker. 

29. The method of Claim 1, wherein said portion of the genome comprises at least 200 kb of contiguous 
genomic DNA. 

30. The method of Claim 1, wherein said portion of the genome comprises at least 300 kb of contiguous 
genomic DNA. 

31. The method of Claim 1, wherein said portion of the genome comprises at least 500 kb of contiguous 
genomic DNA. 

32. The method of Claim 1, wherein said portion of the genome comprises at least 2 Mb of contiguous 
genomic DNA. 

33. The method of Claim 1, wherein said portion of the genome comprises at least 5 Mb of contiguous 
genomic DNA. 

34. The method of Claim 1, wherein said portion of the genome comprises at least 10 Mb of contiguous 
genomic DNA. 

35. The method of Claim 1, wherein said portion of the genome comprises at least 20 Mb of contiguous 
genomic DNA. 

36. The method of Claim 1, further comprising the step of identifying one or more groups of biallelic 
markers which are in proximity to one another in the genome. 

37. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning less than 1kb. 

38. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 1 to 5kb. 

39. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 5 to 10kb. 

40. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 10 to 25kb. 

41. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 25 to 50kb. 

42. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 50 to 150kb. 

43. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 150 to 250kb. 

44. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 250 to 500kb. 

45. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 500kb to 1Mb. 

46. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning more than 1Mb. 

47. A method of obtaining a set of biallelic markers comprising the steps of: 

obtaining a nucleic acid library comprising genomic DNA fragments comprising the full genome or a portion 
thereof; 

determining the sequence of selected regions of said genomic DNA fragments; 
identifying nucleotides in said genomic DNA fragments which vary between individuals, thereby defining a set 
of biallelic markers; and 

determining the order of said biallelic markers along the genome or portion thereof. 

48. A set of biallelic markers obtained by the method of Claim 1. 

49. The set of biallelic markers of Claim 48, wherein the markers in said set have a known genomic 
position. 

50. The set of biallelic markers of Claim 48, wherein the markers are ordered relative to one another. 

51. A set of biallelic markers having a known relationship to one another and a known genomic position, 
said set of biallelic markers being obtained by the method of Claim 1. 

52. The set of biallelic markers of Claim 48, wherein said biallelic markers have heterozygosity rates of 
at least about 0.18. 

53. The set of biallelic markers of Claim 48, wherein said biallelic markers have a heterozygosity rate of at 
least about 0.32. 

54. The set of biallelic markers of Claim 48, wherein said biallelic markers have a heterozygosity rate of 
at least about 0.42. 

55. A map comprising an ordered array of at least 20,000 biallelic markers obtained by the method of 
Claim 1. 

56. The map of Claim 55, comprising an ordered array of at least 60,000 biallelic markers obtained by 
the method of Claim 1. 

57. The map of Claim 55, comprising an ordered array of at least 100,000 biallelic markers obtained by 
the method of Claim 1. 

58. The map of Claim 55, wherein said biallelic markers are distributed at an average marker density of 
one marker every 150kb. 

59. The map of Claim 56, wherein said biallelic markers are distributed at an average marker density of 
one marker every 50 kb. 

60. The map of Claim 57, wherein said biallelic markers are distributed at an average marker density of 
one marker every 25 kb. 

61. A method of identifying one or more biallelic markers associated with a detectable trait comprising 
the steps of: 

determining the frequencies of each allele of one or more biallelic markers obtained by the method of Claim 1 
in individuals who express said detectable trait and individuals who do not express said detectable trait; and 

identifying one or more alleles of said one or more biallelic markers which are statistically associated with the 
expression of said detectable trait. 

62. The method of Claim 61, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy, and drug toxicity. 

63. The method of Claim 61, wherein the phenotype of said individuals who express said detectable trait 
and the phenotype of said individuals who do not express said detectable trait are readily distinguishable from one 
another. 

64. The method of Claim 61, wherein the individuals who express said detectable trait and the individuals 
who do not express said detectable trait are selected from a bimodal phenotype distribution. 

65. The method of Claim 61, wherein said individuals who express said detectable trait are at one 
phenotypic extreme of the population and said individuals who do not express said detectable trait are at the other 
phenotypic extreme of the population. 

66. A method of identifying a haplotype associated with a trait comprising the steps of: 
obtaining nucleic acid samples from trait positive and trait negative individuals; 

determining the frequencies of the alleles of each member of a group of biallelic markers obtained by the 
method of Claim 1 known to be located in proximity to one another in the genome in said nucleic acid samples; and 

identifying a plurality of alleles of biallelic markers having a statistically significant association with said trait. 

67. The method of Claim 66, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy, and drug toxicity. 

68. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic 
region spanning less than 1kb. 

69. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 1 to 5kb. 

70. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 5 to 10kb. 

71. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 10 to 25kb. 

72. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 25 to 50kb. 

73. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 50 to 150kb. 

74. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 150 to 250kb. 

75. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 250 to 500kb. 

76. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning from 500kb to 1Mb. 

77. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a 
genomic region spanning more than 1Mb. 

78. A method of identifying one or more biallelic markers associated with a detectable trait comprising 
the steps of: 

selecting a gene in which mutations result in a detectable trait or a gene suspected of being associated with a 
detectable trait; and 

identifying one or more biallelic markers obtained by the method of Claim 1 within the genomic region 
harboring said gene which are associated with said detectable trait. 

79. The method of Claim 78, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy, and drug toxicity. 

80. The method of Claim 78, wherein said identifying step comprises: 

determining the frequencies of said one or more biallelic markers in individuals who express said detectable 
trait and individuals who do not express said detectable trait; and 

identifying one or more biallelic markers which are statistically associated with the expression of said 
detectable trait. 

81. An array of nucleic acids fixed to a support, said nucleic acids comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, of one or moro biallelic markers obtained by the method of Claim 1. 

82. The array of Claim 81, wherein said nucleic acids comprise at least 8 consecutive nucleotides, 
including the polymorphic nucleotide, of at least five biallelic markers obtained by the method of Claim 1. 

83. The array of Claim 61. wherein said nucleic acids comprise at least 8 consccutivo nucleotides, 
including the polymorphic nucleotide, of at least ten biallelic markers ohtained by the method of Claim 1. 

84. An array of nucleic 8cids fixed to a support, said nucleic acids comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic markers known to be located in 
proximity to one another in the genome. 

85. An array of nucleic acids fixed to a support, said nucleic acids comprising amplification primers for 
generating an amplification product comprising at toast 8 consecutive nucleotides, including the polymorphic nucleotide, 
of one or more biallelic markers obtained by the method of Claim 1 . 

86. An array of nucleic acids fixed to a support, said nucleic acids comprising amplification primers for
generating an amplification product comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide,
of one or more groups of biallelic markers known to be located in proximity to one another in the genome.

87. An array of nucleic acids fixed to a support, said nucleic acids comprising one or more
microsequencing primers for determining the identity of the polymorphic base of one or more nucleic acids comprising at
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the
method of Claim 1.

88. An array of nucleic acids fixed to a support, said nucleic acids comprising one or more
microsequencing primers for determining the identity of the polymorphic bases of one or more groups of biallelic markers
known to be located in proximity to one another in the genome.

89. An array of nucleic acids fixed to a support, wherein said nucleic acids are complementary to one or
more microsequencing primers for determining the identities of the polymorphic bases of one or more biallelic markers
obtained by the method of Claim 1.

90. The array of Claim 89, wherein said nucleic acids are complementary to at least five microsequencing
primers for determining the identities of the polymorphic bases of at least five biallelic markers obtained by the method
of Claim 1.

91. The array of Claim 89, wherein said nucleic acids are complementary to at least ten microsequencing
primers for determining the identities of the polymorphic bases of at least ten biallelic markers obtained by the method
of Claim 1.

92. An array of nucleic acids fixed to a support, said nucleic acids comprising one or more nucleic acids
complementary to one or more microsequencing primers for determining the identity of the polymorphic bases of one or
more groups of biallelic markers known to be located in proximity to one another in the genome.

93. The array of any one of Claims 84, 86, 88, and 92, wherein the members of each of said one or more
groups of biallelic markers are located in physical proximity to one another on said support.

94. The array of any one of Claims 84, 86, 88, and 92, wherein said biallelic markers in each of these
groups are located within a genomic region spanning less than 1 kb. 

95. The array of any one of Claims 84, 86, 88, and 92, wherein said biallelic markers in each of these
groups are located within a genomic region spanning from 1 to 5kb.

96. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 5 to 10kb.

97. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 10 to 25kb.

98. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 25 to 50kb.

99. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 50 to 150kb.

100. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 150 to 250kb.

101. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 250 to 500kb.

102. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning from 500kb to 1Mb.

103. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these
groups are located within a genomic region spanning more than 1Mb.

104. The array of any one of Claims 84, 86, 88, and 92, wherein each group of biallelic markers
comprises at least 3 biallelic markers.

105. The array of any one of Claims 84, 86, 88, and 92, wherein each group of biallelic markers
comprises at least 6 biallelic markers.

106. The array of any one of Claims 84, 86, 88, and 92, wherein each group of biallelic markers
comprises at least 20 biallelic markers.

107. A method for determining whether an individual is at risk of developing a detectable trait or suffers
from a detectable trait associated with said trait comprising the steps of:

obtaining a nucleic acid sample from said individual; 

screening said nucleic acid sample with one or more biallelic markers obtained by the method of Claim 1; and
determining whether said nucleic acid sample contains one or more of biallelic markers statistically
associated with said detectable trait. 

108. The method of Claim 107, wherein said detectable trait is selected from the group consisting of
disease, drug response, drug efficacy and drug toxicity. 

109. The method of Claim 107, wherein said biallelic markers were obtained by the method of Claim 61.

110. The method of Claim 107, wherein said biallelic markers were obtained by the method of Claim 78.

111. A method of using a drug comprising: 
obtaining a nucleic acid sample from an individual; 

determining the identity of the polymorphic base of one or more biallelic markers obtained by the method of
Claim 1 which is associated with a positive response to treatment with said drug or one or more biallelic markers
obtained by the method of Claim 1 which is associated with a negative response to treatment with said drug; and

administering said drug to said individual if said nucleic acid sample contains one or more biallelic markers
associated with a positive response to treatment with said drug or if said nucleic acid sample lacks one or more biallelic
markers associated with a negative response to said drug.
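
The decision rule of Claim 111 can be read as a simple boolean test on a genotype. The following is a minimal sketch under stated assumptions: the data layout (a mapping from marker identifier to the set of bases observed at its polymorphic position), the function name `should_administer`, and the marker identifiers in the usage note are hypothetical and are not taken from the specification.

```python
def should_administer(sample_alleles, positive_markers, negative_markers):
    """Apply the two alternative conditions recited in the claim.

    sample_alleles:   dict, marker id -> set of bases observed in the sample
    positive_markers: dict, marker id -> base associated with a positive response
    negative_markers: dict, marker id -> base associated with a negative response
    """
    def carries(marker, base):
        # True when the sample shows the given base at the marker.
        return base in sample_alleles.get(marker, set())

    # Condition 1: the sample contains at least one positive-response allele.
    has_positive = any(carries(m, b) for m, b in positive_markers.items())
    # Condition 2: the sample lacks at least one negative-response allele.
    lacks_negative = any(not carries(m, b) for m, b in negative_markers.items())
    return has_positive or lacks_negative
```

For example, with `sample_alleles = {"marker_1": {"A", "G"}}` and `positive_markers = {"marker_1": "G"}`, the function returns `True` regardless of the negative-response markers supplied.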

112. The method of Claim 111, wherein said determining step comprises determining the identity of the
polymorphic base of one or more biallelic markers obtained by the method of Claim 62 which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 62 which
is associated with a negative response to treatment with said drug.

113. The method of Claim 111, wherein said determining step comprises determining the identity of the
polymorphic base of one or more biallelic markers obtained by the method of Claim 79 which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 79 which
is associated with a negative response to treatment with said drug.

114. A method of selecting an individual for inclusion in a clinical trial of a drug comprising:
obtaining a nucleic acid sample from an individual;

determining the identity of the polymorphic base of one or more biallelic markers obtained by the method of
Claim 1 which is associated with a positive response to treatment with said drug or one or more biallelic markers
associated with a negative response to treatment with said drug in said nucleic acid sample; and



including said individual in said clinical trial if said nucleic acid sample contains one or more biallelic markers 
obtained by the method of Claim 1 which is associated with a positive response to treatment with said drug or if said 
nucleic acid sample lacks one or more biallelic markers associated with a negative response to said drug. 

115. The method of Claim 114, wherein said determining step comprises determining the identity of the
polymorphic base of one or more biallelic markers obtained by the method of Claim 62 which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 62 which
is associated with a negative response to treatment with said drug.

116. The method of Claim 114, wherein said determining step comprises determining the identity of the
polymorphic base of one or more biallelic markers obtained by the method of Claim 79 which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 79 which
is associated with a negative response to treatment with said drug.

117. A method of identifying a gene associated with a detectable trait comprising the steps of:
determining the frequency of each allele of one or more biallelic markers obtained by the method of Claim 1 in
individuals having said detectable trait and individuals lacking said detectable trait;

identifying one or more alleles of one or more biallelic markers having a statistically significant association
with said detectable trait; and

identifying a gene in linkage disequilibrium with said one or more alleles.
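
For readers unfamiliar with the term, linkage disequilibrium between two biallelic loci is commonly summarised by the coefficients D, D' and r². The sketch below computes them from haplotype and allele frequencies. It is offered only as an illustration of the standard definitions; the function name and argument layout are assumptions, and the claim itself does not prescribe any particular measure.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    # D: deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D': D scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2: squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

Values of D' or r² close to 1 indicate that the alleles are almost always inherited together, which is the situation in which a trait-associated marker allele points to a nearby gene.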

118. The method of Claim 78, further comprising identifying a mutation in the gene which is associated
with said detectable trait.

119. The method of Claim 78, wherein said detectable trait is selected from the group consisting of
disease, drug response, drug efficacy, and drug toxicity.

120. A method of identifying a gene associated with a detectable trait comprising: 
selecting a gene suspected of being associated with a detectable trait; and

identifying one or more biallelic markers obtained by the method of Claim 1 within the genomic region
harboring said gene which are associated with said detectable trait.

121. The method of Claim 120, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy, and drug toxicity.

122. The method of Claim 120, wherein said identifying step comprises:

determining the frequencies of said one or more biallelic markers in individuals who express said detectable
trait and individuals who do not express said detectable trait; and

identifying one or more biallelic markers which are statistically associated with the expression of said
detectable trait.

123. A method of identifying a haplotype associated with a trait comprising the steps of:
obtaining nucleic acid samples from trait positive and trait negative individuals;

conducting an amplification reaction on said nucleic acid samples using amplification primers capable of
generating amplification products containing the polymorphic bases of a plurality of biallelic markers;

contacting one or more arrays according to Claim 84 with said amplification products;
determining the identities of the polymorphic bases of said amplification products; and
identifying a haplotype having a statistically significant association with said trait.

124. A method of identifying a haplotype associated with a trait comprising the steps of:
obtaining nucleic acid samples from trait positive and trait negative individuals;

conducting amplification reactions on said nucleic acid samples using amplification primers capable of
generating amplification products containing the polymorphic bases of a plurality of biallelic markers;

contacting one or more arrays according to Claim 88 with said amplification products;

conducting microsequencing reactions on said amplification products using microsequencing primers on said
arrays, thereby generating elongated microsequencing primers comprising the polymorphic bases of said amplification 
products; 

determining the identities of said polymorphic bases; and 

identifying a haplotype having a statistically significant association with said trait. 

125. A method of identifying a haplotype associated with a trait comprising the steps of:
obtaining nucleic acid samples from trait positive and trait negative individuals;

conducting amplification reactions on said nucleic acid samples using amplification primers which are capable
of generating amplification products containing the polymorphic bases of a plurality of biallelic markers;

conducting microsequencing reactions on said nucleic acid samples, thereby generating microsequencing
products containing the polymorphic bases of one or more biallelic markers at their 3' ends, said polymorphic bases being
detectably labeled;

contacting one or more arrays according to Claim 92 with said microsequencing products such that said
microsequencing products specifically hybridize to said nucleic acids complementary to said microsequencing primers;
determining the identities of the polymorphic bases of said microsequencing products; and
identifying a haplotype having a statistically significant association with said trait.

126. A method of identifying a haplotype associated with a trait comprising the steps of: 
obtaining nucleic acid samples from trait positive and trait negative individuals;
contacting one or more arrays according to Claim 86 with said nucleic acid sample;

conducting an amplification reaction on said nucleic acid samples using amplification primers on said array
which are capable of generating amplification products containing the polymorphic bases of a plurality of biallelic
markers;

determining the identities of the polymorphic bases of said amplification products; and
identifying a haplotype having a statistically significant association with said trait. 

127. A method of determining whether an individual is at risk of developing Alzheimer's disease or whether 
the individual suffers from Alzheimer's disease as a result of possessing the Apo E ε4 Site A allele comprising:
obtaining a nucleic acid sample from said individual; and
determining the identity of the polymorphic base in one or more of the sequences selected from the group
consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences complementary thereto in said nucleic
acid sample.

128. The method of Claim 127, further comprising determining whether said nucleic acid sample contains
the sequence of SEQ ID No. 306 or the sequence complementary thereto.

129. The method of Claim 127, wherein said step of determining the identity of the polymorphic bases in
one or more of the sequences selected from the group consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or
the sequences complementary thereto comprises determining whether said nucleic acid sample contains the sequence of
SEQ ID No. 311 or the sequence complementary thereto.

130. The method of Claim 129, further comprising determining whether said nucleic acid sample contains
the sequence of SEQ ID No. 306 or the sequence complementary thereto. 

131. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 
301, SEQ ID No. 307, the sequences complementary thereto, and fragments comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, thereof. 

132. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 302,
SEQ ID No. 308, the sequences complementary thereto, and fragments comprising at least 8 consecutive nucleotides
thereof.

133. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No.
303, SEQ ID No. 309, the sequences complementary thereto, and fragments comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, thereof.

134. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No.
304, SEQ ID No. 310, the sequences complementary thereto, and fragments comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, thereof.

135. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No.
305, SEQ ID No. 311, the sequences complementary thereto, and fragments comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, thereof.

136. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID Nos.
313-317, SEQ ID Nos. 319-323, and fragments comprising at least 8 consecutive nucleotides thereof.

137. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID Nos.
325-329, SEQ ID Nos. 331-335, the sequence complementary thereto, and fragments comprising at least 8 consecutive
nucleotides thereof.

138. A set of nucleic acids comprising at least 8 consecutive nucleotides, including the polymorphic
nucleotide, of one or more biallelic markers obtained by the method of Claim 1.

139. A set of nucleic acids comprising amplification primers for generating an amplification product
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers
obtained by the method of Claim 1. 






140. A set of nucleic acids comprising one or more microsequencing primers for determining the identity of
the polymorphic base of one or more nucleic acids comprising at least 8 consecutive nucleotides, including the
polymorphic nucleotide, of one or more biallelic markers obtained by the method of Claim 1.

Figure 1

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: GENSET SA

(B) STREET: 24, RUE ROYALE 

(C) CITY: PARIS 

(E) COUNTRY: FRANCE 

(F) POSTAL CODE (ZIP): 75008 



(ii) TITLE OF INVENTION: Biallelic markers for use in constructing a 
high density disequilibrium map of the human genome. 

(iii) NUMBER OF SEQUENCES: 336 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy Disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: Win95 

(D) SOFTWARE: Word 

(2) INFORMATION FOR SEQ ID NO: 1: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2103-270

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CTTGGATTCA TATGAGACAG CTAGCAGACC TTCAATTTTT CTACACT 47



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2228-301 

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCCTGCTTAT CCCTGTAAGG TGGAGACCCA TATGGGCAAG GCCAGAC 47
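
Each 47-base entry in this listing annotates the polymorphic base at position 24 and two potential microsequencing oligos, at 1..23 and at complement 25..47. As a sketch of how those annotations map onto the sequence string, the snippet below derives both oligos from a 47-mer. The function names are hypothetical, and reading "complement 25..47" as the reverse complement of positions 25 to 47 (the usual feature-table convention) is an assumption.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def microsequencing_oligos(fragment, flank=23):
    """Split a polymorphic fragment into its annotated features.

    fragment: sequence as printed in the listing (spaces between blocks allowed)
    flank:    number of bases on each side of the polymorphic base
    """
    seq = "".join(fragment.split()).upper()
    if len(seq) != 2 * flank + 1:
        raise ValueError("expected %d bases, got %d" % (2 * flank + 1, len(seq)))
    return {
        # Positions 1..23 (1-based), read on the listed strand.
        "upstream_oligo": seq[:flank],
        # Position 24, the polymorphic base.
        "polymorphic_base": seq[flank],
        # Positions 25..47, read on the complementary strand.
        "downstream_oligo": reverse_complement(seq[flank + 1:]),
    }
```

Applied to the sequence of SEQ ID NO: 2, `microsequencing_oligos(...)["polymorphic_base"]` gives `"A"`, which agrees with the "base a" annotation of that entry.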



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2229-240

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



TCGTCATCGT GGCCTGGGCT ACAGACTACC TGTTCCAGTC CTTCCAG 47



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2240-281

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:



GCAATCTTAA TAACTTTTTA TTTCAGTAAT TCGAATCTTT TTTTTCT 47



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2242-206 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTGTTTTCTT TTAGTCAAAT TATCTTATAT TTTACTTTTT TCTTAAG 47



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2244-83

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



TAATTGTAGA TACTAAGACC ATTATGCTTA AACCATGTAG GTACTGA 47



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2246-340

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



ATTTATATGT TAAATGCAGA GAAAAAGAAA AATAAGTTTT GCAGTAA 47 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2248-76

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



GACAGAGAGG GAAGGTAATC TTCCCCTGAA GTCTGCCCAT CCCCTGG 47



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2250-236

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



ATGTATCCAA AACAGAATTA ACACACTTTG GGTTTTTTAT TTTTATT 47 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2251-151

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



TGAAAAGAAG TTCAGACGAT TGCAGATAGA CTAGTTTGGC TGTTGTG 47



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2269-179

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



AAAATAAAGA AATTCCTAGA GACATACAGC CTATCAAGAT CAAACCA 47



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2271-403 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



AGGCATTTAT TTCATATTTA TTAACCTTGA TTTTCTTATC TTCAAGT 47



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2272-409

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:



AAAAGCACTG CAATTATTTT GGAGACTGTG AAATATTGCA AGTTTTA 47



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2273-528

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



ACTTGAAGAT AAGAAAATCA AGGCTAATAA ATATGAAATA AATGCCT 47



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2275-466

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TTGATGATAG CATTAAATAC TCCCAAAAAC TGTGAATAGG GATACTA 47



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2278-276

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:



GAAAAAAATG GGAACATCTT CACAGCCTGT GCATCTCCAA CAAGATT 47



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2312-358

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TTGAAGAGAG AGATGGAAAA AAACGTAGGC CTTCTGGGTA AATGGCC 47



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2315-213 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

AGATGGATTC TACCCACAGG CAAAAGAAAA CCTTATTTTA AAAATAA 47



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic fragment 99-2320-292 

(B) LOCATION: 1. .47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
ACTCTCATTC ACTAAACTTC AACCGTTTTT ATAAATTTAA TGAATTT 4 7 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 
(A) NAME/KEY: polymorphic fragment 99-2321-82

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2321-82 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



TAAAGCTTAC TCAGTGTCCA CTCCGGATAC CTACTCAAAT ATTTCCT 4 7 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2324-338 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2324-338 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AGATAGAAGA CAAAATCGCA GGAAAAGAAA TCCCTCAACA GTAAAAA 4 7 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic fragment 99-2333-423 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2333-423 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



GAGACGCTAT CTATGCAAGG AGGGTGTTCA ACATTTGGAC AGCCACG 4 7 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2341-485 

(B) LOCATION: 1..47 
(ix) FEATURE:

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2341-485 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



ACACATCTGT CTGTTACCTA CACCTTACAA AGAATCGCAC AGGCTCT 47 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2342-217 

(B) LOCATION: 1..4 7 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 



(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2342-217 

(B) LOCATION: 1. .23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2342-217 

(B) LOCATION: complement 25.. 47 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



TAGAGCCTTG GACTTTCATG ACACTTCTAG AAACAGCCCA GATTGTG 47
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2362-270 

(B) LOCATION: 1..4 7 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TCTCTCTTGG GTGGTTCCTC AACATGTGTG ACCTTGACCA AGTATTG 4 7 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2364-329 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 
(D) OTHER INFORMATION: base g
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
ATATAAAATG ATGAACCATA TACGTGAGGC AAGGTAACAT ATAATTG 4 7 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2367-61 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TAAACATTTC ATTATTTCAG AAAATAATAT GCATTTTCAC CAACACA 4 7 



(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic fragment 99-2371-93 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CTCTAAACTT TCCTAATACT TACATCACTG CCTACTTTTT ACATAAT 4 7 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2378-200 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2378-200 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 



GAGAACTTCC TGTTGAACCT GTTATAGAAC TGTCCTGTCG TCCAAGA 4 7 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2381-394 

(B) LOCATION: 1. .47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2381-394 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



AGTGGTCTTC AGGTTATTGG TAGAGAAAAG TAGGGGAGCT AAAGGTG 4 7 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2413-368 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



ATTTTAAGAG GAAAACTTAA TGGAAGAATT GTACATAATA TTTCATT 4 7 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2419-285 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2419-285 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



AAGGGATCAA GCAGTGCCCA CTCCCCACCC TCCAGGGAGC TGTGACT 4 7 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2559-253 

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



CAGGTGTTTT CATGCCCTCT TAGGGTGTGT CACATCATCC ATCTCAA 4 7 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2566-112 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112 

(B) LOCATION: 1. .23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



GCCTTCACAA CCGCAGAGGC AAGAGAAGGA GCTTGGCCAC CCTGACT 4 7 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2567-329

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2567-329 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/ KEY : Potential microsequencing oligo 99-2567-329 

(B) LOCATION: complement 25.. 47 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:



CACTGTCAGA TATGAAATGA TGCGTGGCTT TCTTTGGGCT ATATTTG 4 7 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2570-218 

(B) LOCATION: 1. .47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: 1. .23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



GGAAAGTTCC AAATTATGAG AAGCGAGGCC TCTGAAGTGG CTAAGTT 4 7 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2571-242 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: 1. .23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2571-242 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



ATAATGAATG AGTATTTGAT ATTATATAAT TAAATGTGTC AGCATTT 4 7 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2610-121 

(B) LOCATION: 1..4 7 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2610-121 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2610-121 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATACCCCTTC CCTAGGTATG GCTATATGCT GCACTTAGAA AATTCTC 47



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2615-83 

(B) LOCATION: 1..4 7 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/ KEY : Potential microsequencing oligo 99-2615-83 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



AACAAATCAC AAGTTGGCAA AAGCAGCAAA TTCTCATCTT CTGGGAA 4 7 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2620-227 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2620-227 

(B) LOCATION: 1. .23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2620-227 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



TTGACTGGGC TCCTGATGTG TCCAGGGTAT CTTGCTGGCT GTTTTGC 4 7 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2624-407 

(B) LOCATION: 1..44 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2624-407 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2624-407 

(B) LOCATION: complement 25..44

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



ATCTGGCCAT AGGCAGAACA TTGGGGGAGA GATGGGGAAA GAGA 44
(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2625-70 

(B) LOCATION: 1..4 7 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



AGTGACTCAA CCAGAAAGAG AGCAGGAGAG AGGACGAAGA GAGGAGA 4 7 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2630-67 

(B) LOCATION: 1. .47 

(ix) FEATURE: 



(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2630-67 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2630-67 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
TAAATTCTGC CTAGAAGATT AAGATTGGTC CAGAACAGGG AGTGTTT 4 7 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2633-129 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2633-129 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2633-129 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
TAGCTATTTC TTCCCCTAGG CAAAGTAGAC AATGAGAGAA CCCTTGA 4 7 



(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2634-341

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2634-341 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



GGAATCAATA TTTATTTATT ATCAACAGGT GAGACATTAT TTATTTA 4 7 



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2637-28 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:



CCATCACTTC CTCCTAGTGA AAAATCAAAG GAGGGTGGGT TTTATAG 47



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2642-255 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2642-255 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 



TGAGGGTGTT TCCAGAAGAG ACTAGCATTT GAATCTGAAG TGAGTAA 47



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic fragment 99-2645-118 

(B) LOCATION: 1. .47 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: 1 . . 23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
CACAAATTAA TTGCATTGTT ATAGGCTAGC AATGAAGAAT CTGAAAA 4 7 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2647-368 

(B) LOCATION: 1. . 47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2647-368 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
TTAAGGCCTT CAACTGATTA GACAAGGCCC ACTCACATTA TCTGACA 4 7 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2649-107 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2649-107 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
CACAACTCTG GAGCCTTTTA TGAACAGGAC AGCAATGCAC TGAAACT 4 7 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2103-270

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID1

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c; g in SEQ ID1 
(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2103-270 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2103-270 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
CTTGGATTCA TATGAGACAG CTACCAGACC TTCAATTTTT CTACACT 4 7 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2228-301 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID2 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID2 
(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2228-301 

(B) LOCATION: 1..23 






WO 99/04038 






PCT/IB98/01193 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



CCCTGCTTAT CCCTGTAAGG TGGGGACCCA TATGGGCAAG GCCAGAC 4 7 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2229-240 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID3 

(ix) FEATURE: 

(A) NAME/ KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID3 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



TCGTCATCGT GGCCTGGGCT ACATACTACC TGTTCCAGTC CTTCCAG 4 7 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2240-201

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID4 

(ix) FEATURE: 

(A) NAME /KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID4 
(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2240-281 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2240-281 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
GCAATCTTAA TAACTTTTTA TTTTAGTAAT TCGAATCTTT TTTTTCT 4 7 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2242-206 

(B) LOCATION: 1. .47 

(D) OTHER INFORMATION: variant version of SEQ ID5

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID5 
(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2242-206 

(B) LOCATION: 1..23 



(ix) FEATURE:

(A) NAME /KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



GTGTTTTCTT TTAGTCAAAT TATTTTATAT TTTACTTTTT TCTTAAG 47



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-2244-83 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID6 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID6 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: 1 . . 23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2244-83 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 56: 



TAATTGTAGA TACTAAGACC ATTGTGCTTA AACCATGTAG GTACTGA 4 7 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 7 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2246-340

(B) LOCATION: 1. .47 

(D) OTHER INFORMATION: variant version of SEQ ID7 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID7 
(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2246-340 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2246-340 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 



ATTTATATGT TAAATGCAGA GAAGAAGAAA AATAAGTTTT GCAGTAA 4 7 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME / KEY : polymorphic fragment 99-2248-76 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID8 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID8 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76

(B) LOCATION: 1..23 



(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
GACAGAGAGG GAAGGTAATC TTCTCCTGAA GTCTGCCCAT CCCCTGG 47



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2250-236

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID9 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID9 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

ATGTATCCAA AACAGAATTA ACATACTTTG GGTTTTTTAT TTTTATT 47



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2251-151 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID10

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID10 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
TGAAAAGAAG TTCAGACGAT TGCGGATAGA CTAGTTTGGC TGTTGTG 47



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2269-179

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID11 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID11 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
AAAATAAAGA AATTCCTAGA GACGTACAGC CTATCAAGAT CAAACCA 47



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2271-403 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID12

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID12 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
AGGCATTTAT TTCATATTTA TTAGCCTTGA TTTTCTTATC TTCAAGT 47



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2272-409

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID13 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID13 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

AAAAGCACTG CAATTATTTT GGATACTGTG AAATATTGCA AGTTTTA 47



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2273-528

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID14 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID14 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
ACTTGAAGAT AAGAAAATCA AGGTTAATAA ATATGAAATA AATGCCT 47



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2275-466

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID15 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID15 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
TTGATGATAG CATTAAATAC TCCTAAAAAC TGTGAATAGG GATACTA 47



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2278-276 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID16 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID16 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GAAAAAAATG GGAACATCTT CACGGCCTGT GCATCTCCAA CAAGATT 47



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2312-358

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID17 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID17
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
TTGAAGAGAG AGATGGAAAA AAATGTAGGC CTTCTGGGTA AATGGCC 47



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2315-213 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID18 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID18 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
AGATGGATTC TACCCACAGG CAAGAGAAAA CCTTATTTTA AAAATAA 47



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2320-292

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID19 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID19 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
ACTCTCATTC ACTAAACTTC AACTGTTTTT ATAAATTTAA TGAATTT 47



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2321-82

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID20 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID20 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
TAAAGCTTAC TGAGTGTCCA CTCTGGATAC CTACTCAAAT ATTTCCT 47



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2324-338

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID21 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID21 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
AGATAGAAGA CAAAATCGCA GGACAAGAAA TCCCTCAACA GTAAAAA 47



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2333-423 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID22 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID22 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
GAGACGCTAT CTATGCAAGG AGGTTGTTCA ACATTTGGAC AGCCACG 47 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2341-485 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID23 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID23 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

ACACATCTGT CTGTTACCTA CACTTTACAA AGAATCGCAC AGGCTCT 47



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2342-217 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID24 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID24 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

TAGAGCCTTG GACTTTCATG ACATTTCTAG AAACAGCCCA GATTGTG 47



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2362-270

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID25 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID25 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
TCTCTCTTGG GTGGTTCCTC AACGTGTGTG ACCTTGACCA AGTATTG 47



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2364-329

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID26 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; g in SEQ ID26 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

ATATAAAATG ATGAACCATA TACCTGAGGC AAGGTAACAT ATAATTG 47



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2367-61

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID27 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID27 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

TAAACATTTC ATTATTTCAG AAAGTAATAT GCATTTTCAC CAACACA 47



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2371-93

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID28 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID28 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
CTCTAAACTT TCCTAATACT TACCTCACTG CCTACTTTTT ACATAAT 47



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2378-200

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID29 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID29
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

GAGAACTTCC TGTTGAACCT GTTGTAGAAC TGTCCTGTCG TCCAAGA 47



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2381-394 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID30 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID30 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

AGTGGTCTTC AGGTTATTGG TAGGGAAAAG TAGGGGAGCT AAAGGTG 47



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2413-368 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID31 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID31 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

ATTTTAAGAG GAAAACTTAA TGGGAGAATT GTACATAATA TTTCATT 47



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2419-285

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID32 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID32 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
AAGGGATCAA GCAGTGCCCA CTCTCCACCC TCCAGGGAGC TGTGACT 47



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2559-253 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID33

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID33 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CAGGTGTTTT CATGCCCTCT TAGTGTGTGT CACATCATCC ATCTCAA 47



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2566-112

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID34 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID34 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
GCCTTCACAA CCGCAGAGGC AAGGGAAGGA GCTTGGCCAC CCTGACT 47



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2567-329

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID35 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID35 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
CACTGTCAGA TATGAAATGA TGCTTGGCTT TCTTTGGGCT ATATTTG 47



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2570-218

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID36 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID36 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
GGAAAGTTCC AAATTATGAG AAGTGAGGCC TCTGAAGTGG CTAAGTT 47



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2571-242 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID37 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID37 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
ATAATGAATG AGTATTTGAT ATTGTATAAT TAAATGTGTC AGCATTT 47



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2610-121

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID38

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID38 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
ATACCCCTTC CCTACGTATG GCTCTATCCT GCACTTAGAA AATTCTC 47



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2615-83 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID39 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID39 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
AACAAATCAC AAGTTGGCAA AAGTAGCAAA TTCTCATCTT CTGGGAA 47



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2620-227 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID40 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID40 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
TTGACTGGGC TCCTGATGTG TCCGGGGTAT CTTGCTGGCT GTTTTGC 47



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2624-407

(B) LOCATION: 1..44 

(D) OTHER INFORMATION: variant version of SEQ ID41 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID41 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407

(B) LOCATION: complement 25..44

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
ATCTGGCCAT AGGCAGAACA TTGTGGGAGA GATGGGGAAA GAGA 44



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2625-70 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID42 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID42
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
AGTGACTCAA CCAGAAAGAG AGCGGGAGAG AGGACGAAGA GAGGAGA 47



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2630-67

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID43

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID43
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

TAAATTCTGC CTAGAAGATT AAGGTTGGTC CAGAACAGGG AGTGTTT 47



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2633-129

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID44

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID44 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

TAGCTATTTC TTCCCCTAGG CAACGTAGAC AATGAGAGAA CCCTTGA 47



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2634-341 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID45 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID45 
(ix) FEATURE: 

(A) NAME /KEY : Potential microsequencing oligo 99-2634-34 1 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME /KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 



GGAATCAATA TTTATTTATT ATCGACAGGT GAGACATTAT TTATTTA 47



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2637-28

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID46 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID46
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 



CCATCACTTC CTCCTAGTGA AAAGTCAAAG GAGGGTGGGT TTTATAG 47



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2642-255

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID47 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
TGAGGGTGTT TCCAGAAGAG ACTGGCATTT GAATCTGAAG TGAGTAA 47



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2645-118

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID48 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID48 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 



CACAAATTAA TTGCATTGTT ATATGCTAGC AATGAAGAAT CTGAAAA 47



(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2647-368

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID49 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID49 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 



TTAAGGCCTT CAACTGATTA GACGAGGCCC ACTCACATTA TCTGACA 47



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2649-107

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID50 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; a in SEQ ID50 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 



CACAACTCTG GAGCCTTTTA TGATCAGGAC AGCAATGCAC TGAAACT 47



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID1, SEQ ID51

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 



CCTGGATTCT GACCCATC 18 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID2, SEQ ID52

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 



TCTACCTCTA CCTCTTTC 18

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID3, SEQ ID53

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 



CTTCCCATAC CTCTGATAC 19 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID4, SEQ ID54

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 



TTCAACAGTG AAGCCATC 18 



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID5, SEQ ID55

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
TGATGTGTGT GACTCAGG 18 

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID6, SEQ ID56

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
ATAGAGGAAC CAAACCTG 18 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID7, SEQ ID57

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

AGCAGCATGG AAGCAAAC 18 



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID8, SEQ ID58

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
CTGATGAAAG TGGCTCTC 18 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID9, SEQ ID59 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 



TGTATCTGAG GTCTAAAAC 19

(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID10, SEQ ID60

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 



TATATGTAGA GGGTGAGG 18



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID11, SEQ ID61

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 



AGGCTAAGAA AAAAAGAGG 19



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID12, SEQ ID62

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 
TGAAAAGACT AAGTTCTGG 19 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID13, SEQ ID63 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 
ATGCTAGAGG AAAGGAAC 18 

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID14, SEQ ID64

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
ATACCAGGGA CTTTAGTG 18



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID15, SEQ ID65 

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 
AGATTCAGAC CAATTTCAC 19 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID16, SEQ ID66

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 
TGCTTTGATT TGACCCTG 18 



(2) INFORMATION FOR SEQ ID NO: 117: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID17, SEQ ID67 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 
GCCTATCTTG TTTTGACTG 19 

(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID18, SEQ ID68 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 
TTCAGAGCAA CAATTTTGG 19 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID19, SEQ ID69

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 
CCAAGTTTAT GAGATTAGAG 20 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID20, SEQ ID70 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 
CTAACCTAGA TGATCTTCC 19 



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID21, SEQ ID71 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 
TGTCCCAAGT TTAGTTCC 18

(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID22, SEQ ID72 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 



CCAGGAATAA TACTTTGCAT C 21 



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID23, SEQ ID73 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 



CTCAGTTTTT CTTTCCACC 19 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID24, SEQ ID74 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 
GACTCAGGCA CAACTTTTAG 20 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID25, SEQ ID75 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 
TACAGCAATG GTATAAAGC 19 



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID26, SEQ ID76

(B) LOCATION: 1..20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

TTATCCATCA TTTAGAAGGC 20 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID27, SEQ ID77 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 
CACTGGAGAT AGCTGAAC 18 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID28, SEQ ID78

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 
GTACTGTCAA ATCATCACC 19

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID29, SEQ ID79 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 



CGGGCATAAA AATGCAGG 18



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID30, SEQ ID80

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 



GTATATGTGA AGGTTGTGGG 20



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID31, SEQ ID81

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 


GTAACATGTG ACTTGCTCC 19 



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID32, SEQ ID82 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 
CCAGCTTGAA TTTTGGTGAG 20



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID33, SEQ ID83

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
GCATATCTTG GTGGTCTG 18

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID34, SEQ ID84

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
AGGGTTCAAA GGAAGGAGG 19 



(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID35, SEQ ID85

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 
GAAAAAGAAG GGAAAGAAAG 20



(2) INFORMATION FOR SEQ ID NO: 136: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID36, SEQ ID86 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 



GTTTGTCTTG GCTATTAAG 19



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID37, SEQ ID87

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 
TGAAAAAGTG GGTAGCAG 18 



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID38, SEQ ID88

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 



ATATCAGGGC AGGCACAAG 19



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID39, SEQ ID89 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 



GGAAGAGGGC AACTTTAC 18



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID40, SEQ ID90

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:



TGAAATGGGC TGTAGATG 18

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID41, SEQ ID91 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
TTAAACCTTG GCTTCCTG 18 



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID42, SEQ ID92

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 
TTCAACCTTT TGTCGCTG 18 

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID43, SEQ ID93

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
ATGTAACAGA TGTCCAAAG 19 



(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID44, SEQ ID94

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 
CTAAGGGTCT TCTTTCTG 18 



(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID45, SEQ ID95

(B) LOCATION: 1..19
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

GGTGTATTTA GGTTTGTGG 19 

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID46, SEQ ID96

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
CTACCATCAC TTTCCTCC 18 

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID47, SEQ ID97

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
ATAACTAGGC ATCCAGAC 18

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID48, SEQ ID98 

(B) LOCATION: 1..20



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 
CGACATAATT TGGTATGTAG 20



(2) INFORMATION FOR SEQ ID NO: 149:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID49, SEQ ID99 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:



TCACCAAGTG TCATCGTC 18 
(2) INFORMATION FOR SEQ ID NO: 150: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID50, SEQ ID100

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

GAGACTTTGT AACTTTGTG 19



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID1, SEQ ID51

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

GTCTTCATAA GTCTTCAGTG 20 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID2, SEQ ID52

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

CAAAACACTC CCTCACAC 18 



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID3, SEQ ID53

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

CAGGTGATGT CTGGATAC 18 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID4, SEQ ID54

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

AAGACAACAA GAACTAAATC C 21

(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID5, SEQ ID55

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

TCCCCAATAG ATTAAAGTTC 20 



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID6, SEQ ID56

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

CTGAGCATCA AATAGGAG 18 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID7, SEQ ID57

(B) LOCATION: 1..21
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

TCATTACAGA AAAAGCCAAA G 21



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID8, SEQ ID58

(B) LOCATION: 1..20 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 
TCCTTCTCCA CCTAAAATTC 20

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID9, SEQ ID59

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

ACTGCTTCTG CTCTCTTG 18 



(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID10, SEQ ID60

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

TGAACATACA AAAACACTGG 20 



(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID11, SEQ ID61

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

AGAGTTGTTG GCATGTAG 18



(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID12, SEQ ID62

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

AACTGCTCAG CAACTGTG 18

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID13, SEQ ID63

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

TTAGAACACT TTTATGGGAA C 21

(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID14, SEQ

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

GTCCTAGAAT GAGCAAATG 19 



(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID15, SEQ ID65

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

AGAGAAAGAA CCAGAGCC 18 

(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID16, SEQ ID66

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

TGGAGTCTAA ACTAGGTG 18 



(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID17, SEQ ID67

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

GGACCTTTTA AGAGTGTG 18



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID18, SEQ ID68

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

TGGTTTCTTC AAACAAGAG 19 



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID19, SEQ ID69

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

AAGTTGGATA ACCTTCTTTT G 21 

(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID20, SEQ ID70

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

TAGTTTCGTG AACTTATCC 19



(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID21, SEQ ID71

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

GTTTACATTA TGCCCCTTTT C 21 



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID22, SEQ ID72

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

CTCCACTGCC ACAACTTC 18 

(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID23, SEQ

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

TGCTCTGCTT GTAATGTTAT G 21 



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID24, SEQ ID74

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

CAAGGTTGCC AGTCACATC 19

(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 






(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID25, SEQ ID75

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

ATGAAGATAC GCAGCCAG 18 



(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID26, SEQ ID76

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

CTCATTTAAC TCCCATTCCT C 21



(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID27, SEQ ID77

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

TGCTTTTCTT GTCCCTGATT G 21



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID28, SEQ ID78

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

GCATTGAATC CGTAAATTTC 20

(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID29, SEQ ID79

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

CAGTTTTGGT CATTGTGGGA G 21

(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID30, SEQ ID80

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 



AAATCCAACT ATGTCACTTC C 21 



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID31, SEQ ID81

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

AATGTCCCCT CCTCCTCTG 19



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID32, SEQ ID82

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

GCCACAAGTA TTTGGGTGCC 20



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID33, SEQ ID83

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

CCTACGGTTT GTCATAAAG 19 



(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID34, SEQ ID84

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

TGTAACAGGG GACATGGGAA G 21 



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID35, SEQ ID85

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

CAATTTTGTA TGGATGACAG 20 



(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID36, SEQ ID86

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

TGGTGGTGGA AAAAAAGAAG G 21



(2) INFORMATION FOR SEQ ID NO: 187:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID37, SEQ ID87

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

CTATAACTCT TATCAGTGAA C 21



(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID38, SEQ ID88

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

AGGTCACTCA AGTATTATGG 20



(2) INFORMATION FOR SEQ ID NO: 189: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID39, SEQ ID89

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

CCCCAGCTCC CAAATAATGA C 21



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID40, SEQ ID90

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

TCCACAACAG ACACTTAAAC 20 



(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID41, SEQ ID91

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

TCTCTTTCCC CATCTCTC 18

(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID42, SEQ ID92

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

TCCCCTTCTA TTGTCTACC 19 

(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID43, SEQ ID93

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

GGTTTGTGTT CAGTACGG 18



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID44, SEQ ID94

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

TGTATATGCC TGGTGGAAAT G 21



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID45, SEQ ID95

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GTGAAAGAAA CTTGATAGAG G 21

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID46, SEQ ID96

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

CCTCCAACAG TAAGAATC 18 



(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID47, SEQ ID97

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

CAGAACCATT AACTATTCAC 20 

(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID48, SEQ ID98

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

GCCATTTGGA ATTTTGATAG 20 



(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID49, SEQ ID99

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

TGCAGCATCC CTGGAAGTC 19 



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID50, SEQ ID100

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 



GAGACATCAT ATCTGTGTTT G 21 



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2103-270.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

GATTCATATG AGACAGCTA 19



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2228-301.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

CCCTGCTTAT CCCTGTAAGG TGG 23



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2229-240.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

TCGTCATCGT GGCCTGGGCT ACA 23



(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2240-281.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

TCTTAATAAC TTTTTATTT 19



(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2242-206.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:

TTTCTTTTAG TCAAATTAT 19



(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2244-83.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 



TAATTGTAGA TACTAAGACC ATT 23 



(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2246-340.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

ATTTATATGT TAAATGCAGA GAA 23



(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2248-76.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

GAGAGGGAAG GTAATCTTC 19



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2250-236.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

TTTTATCCAA AACAGAATTA ACA 23

(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2251-151.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 



TGAAAAGAAG TTCAGACGAT TGC 23 



(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2269-179.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:

AAAATAAAGA AATTCCTAGA GAC 23



(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2271-403.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

AGGCATTTAT TTCATATTTA TTA 23

(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2272-409.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

AAAAGCACTG CAATTATTTT GGA 23

(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2273-528.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

GAAGATAAGA AAATCAAGG 19



(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2275-466.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

TGATAGCATT AAATACTCC 19



(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2278-276.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

GAAAAAAATG GGAACATCTT CAC 23



(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2312-358.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 



TTTTAGAGAG AGATGGAAAA AAA 23 



(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2315-213.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

AGATGGATTC TACCCACAGG CAA 23



(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2320-292.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

TCATTCACTA AACTTCAAC 19



(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2321-82.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

GCTTACTGAG TGTCCACTC 19



(2) INFORMATION FOR SEQ ID NO: 221: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2324-338.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

AGAAGACAAA ATCGCAGGA 19



(2) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2333-423.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

GAGACGCTAT CTATGCAAGG AGG 23



(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2341-485.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

TTTTATCTGT CTGTTACCTA CAC 23



(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2312-217.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

TTTTGCCTTG GACTTTCATG ACA 23



(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2362-270.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:

TCTCTCTTGG GTGGTTCCTC AAC 23



(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-236-1-329.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:

AAAATGATGA ACCATATAC 19



(2) INFORMATION FOR SEQ ID NO: 227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2367-61.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:

TAAACATTTC ATTATTTCAG AAA 23



(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2371-93.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:

TTTTAAACTT TCCTAATACT TAC 23

(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2378-1:00.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:



C.IAGAACTTCC TGTTGAACCT GTT 



(2) INFORMATION FOR SEQ ID NO: 230: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2381-394.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:

AGTGGTCTTC AGGTTATTGG TAG 23



(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo l >')-21 1 t- id'H.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231: 
ATTTTAACIAC C IAAAACTT AA TGG 23



(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo i C )-2H f> .mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232: 
GATCAAGCAG TGCCCACTC 19 

(2) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY: potential microsequencing oligo 99-2559-253 . mis 1 

(B) LOCATION: 1..23 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233: 
CAGGTGTTTT CATGCCCTCT TAG 23 

(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2566-112.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234: 
CCCTTCACAA CCGCAGAGGC AAG 23

(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2567-329.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 
CACTGTCAGA TATGAAATGA TGC 23



(2) INFORMATION FOR SEQ ID NO: 236: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2570-218.mis1
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 



AGTTCCAAAT TATGAGAAG 



(2) INFORMATION FOR SEQ ID NO: 237: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2571-242.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237: 



ATAATGAATG AGTATTTGAT ATT 



(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2610-121.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:
TTTTCCCTTC CCTAGGTATG GCT 23

(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2615-83.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:
TTTTAATCAC AAGTTGGCAA AAG 23



(2) INFORMATION FOR SEQ ID NO: 240: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2620-227.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240: 
TTGACTGGGC TCCTGATGTG TCC



(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2624-407.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:
ATCTGGCCAT AGGCAGAACA TTG 23



(2) INFORMATION FOR SEQ ID NO: 242:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2625-70.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:
AGTGACTCAA CCAGAAAGAG AGC 23



(2) INFORMATION FOR SEQ ID NO: 243:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2630-67.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:



TAAATTCTCC CTAGAAGATT AAG 



(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2633-120.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244: 



TTTTTATTTC TTCCCCTAGG CAA 



(2) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 
(A) NAME/KEY: potential microsequencing oligo 99-2634-341.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245: 



c;gaatcaata tttatttatt atc 



(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2637-28.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246: 



CCATCACTTC CTCCTACTGA AAA 



(2) INFORMATION FOR SEQ ID NO: 247: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2642-255.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 



TGAGGGTGTT TCCAGAAGAG ACT 
(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2645-118.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:



CACAAATTAA TTGCATTGTT ATA 



(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2647-368.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249: 
TTAAGGCCTT CAACTGATTA GAC 23 



(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2649-107.mis1
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 



ACTCTC ;CAGC CTTTT ATGA 



(2) INFORMATION FOR SEQ ID NO: 251: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2101-270.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251: 



AGTGTAGAAA AATTGAAGGT CTG 



(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2228-301.mis2

(B) LOCATION: 1..19 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:
GGCCTTGCCC ATATGGGTC 



(2) INFORMATION FOR SEQ ID NO: 253: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2220-240.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253: 
AAGGACTGGA ACAGGTAGT 



(2) INFORMATION FOR SEQ ID NO: 254: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2240-281.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254: 
AGAAAAAAAA GATTCGAATT ACT 23 



(2) INFORMATION FOR SEQ ID NO: 255: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo l )V-224 :»-20(i.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255: 
CTTAAGAAAA AAGTAAAATA TAA 23



(2) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2244-83.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 
TACCTACATG GTTTAAGCA 19 

(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2246-340.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:
T< JCAAAACTT ATTTTTCTT 19

(2) INFORMATION FOR SEQ ID NO: 258:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : potential microsequencing oligo 99-224 8-7 (3 . mis2 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258: 
CCAGGGGATG GGCAGACTTC AGG 23

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : potential microsequencing oligo 99-2250-236 . mis2 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259: 



AATAAAAATA AAAAACCCAA AGT



(2) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2251-151.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:



TTTTACACCC AAACTAGTCT ATC 



(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2269-179.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 



TTGATCTTGA TAGGCTGTA 



(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2271-403.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:
GAAGATAAGA AAATCAAGG 19



(2) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2272-409.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:
TTTTACTTGC AATATTTCAC AGT 23



(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 
(A) NAME/KEY: potential microsequencing oligo 99-2273-528.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:



AGGCATTTAT TTCATATTTA TTA



(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2275-466.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:
TAGTATCCCT ATTCACAGTT TTT 23



(2) INFORMATION FOR SEQ ID NO: 266: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2278-276.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 



TTGTTGGAGA TGCACAGGC 
(2) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2312-358.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:



G( ICCATTTAC CCAGAAGGCC TAC 23



(2) INFORMATION FOR SEQ ID NO: 268: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2315-213.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268: 



TTTTTTTTAA AATAAGGTTT TCT 23 



(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2320-2!l2.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:
AAATTCATTA AATTTATAAA AAC 23



(2) INFORMATION FOR SEQ ID NO: 270: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2321-82.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:
AGGAAATATT TGAGTAGGTA TCC 23



(2) INFORMATION FOR SEQ ID NO: 271: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2324-338.mis2

(B) LOCATION: 1..23 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:
TTTTTACTGT TGAGGGATTT CTT 

(2) INFORMATION FOR SEQ ID NO: 272:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2333-423.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272: 
TTTTCCTGTC CAAATGTTGA ACA 



(2) INFORMATION FOR SEQ ID NO: 273:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2341-485.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:
AGAGCCTGTG CGATTCTTTG TAA 



(2) INFORMATION FOR SEQ ID NO: 274: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2342-217.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:



CACAATCTGG GCTGTTTCTA GAA



(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2362-270.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275: 



TTTTACTTGG TCAAGGTCAC ACA 



(2) INFORMATION FOR SEQ ID NO: 276: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2364-329.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:



CAATTATATG TTACCTTGCC TCA 



(2) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2367-61.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:



TTTTTTGGTG AAAATGCATA TTA 



(2) INFORMATION FOR SEQ ID NO: 278: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2371-93.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278: 
ATTATGTAAA AAGTAGGCAG TGA 23

(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2378-200.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:
GGACGACAGG ACAGTTCTA 19 

(2) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2381-394.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280: 
TTTAGCTCCC CTACTTTTC 19 



(2) INFORMATION FOR SEQ ID NO: 281: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 1 5- 'Mill .mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:



AAATATTATG TACAATTCT 



(2) INFORMATION FOR SEQ ID NO: 282: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2419-285.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:
AGTCACAGCT CCCTGGAGGG TGG 23



(2) INFORMATION FOR SEQ ID NO: 283: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 



(A) NAME/KEY: microsequencing oligo 99-2559-253.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283: 
CATGGATGAT GTGACACAC 19



(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2566-112.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284: 
AGGGTCGCCA AGCTCCTTC 19



(2) INFORMATION FOR SEQ ID NO: 285: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2567-329.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285:



TATAGCCCAA AGAAAGCCA 19



(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2570-218.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286:
AACTTAGCCA CTTCAGAGGC CTC 23



(2) INFORMATION FOR SEQ ID NO: 287: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2571-242.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287: 
GCTGACACAT TTAATTATA 19 



(2) INFORMATION FOR SEQ ID NO: 288: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2610-121.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288:
GAGAATTTTC TAAGTCCAGC ATA 23



(2) INFORMATION FOR SEQ ID NO: 289: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2615-83.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:
TTCCCAGAAG ATGAGAATTT GCT 23



(2) INFORMATION FOR SEQ ID NO: 290: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2620-227.mis2

(B) LOCATION: 1..23
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:
TTTTAACAHC CAGCAAGATA CCC 



(2) INFORMATION FOR SEQ ID NO: 291:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2624-407.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291: 
TTTTCTCTTT CCCCATCTCT CCC 



(2) INFORMATION FOR SEQ ID NO: 292: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2625-70.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292: 
TTTTCTCTCT TCKTCCTCTC TCC 



PCT/IB98/01193 




(2) INFORMATION FOR SEQ ID NO: 293:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2630-67.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293: 
TTTTACTCCC TGTTCTGGAC CAA 23 

(2) INFORMATION FOR SEQ ID NO: 294: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2633-129.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:
TCAAGGGTTC TCTCATTGTC TAC 23 

(2) INFORMATION FOR SEQ ID NO: 295: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2634-341.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:
TTTTTAAATA ATGTCTCACC TGT 23

(2) INFORMATION FOR SEQ ID NO: 296: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2637-20.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:
TTTTAAAACC CACCCTCCTT TGA 23

(2) INFORMATION FOR SEQ ID NO: 297: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2642-255.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297: 
TCACTTCAGA TTCAAATGC 19



(2) INFORMATION FOR SEQ ID NO: 298:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2645-118.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:
TTTTCACATT CTTCATTGCT AGC 23



(2) INFORMATION FOR SEQ ID NO: 299: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2647-368.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299:
AGATAATGTG AGTGGGCCT 19



(2) INFORMATION FOR SEQ ID NO: 300: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2649-107.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:
AGTTTCAGTG CATTGCTGTC CTG 23



(2) INFORMATION FOR SEQ ID NO: 301: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-344
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-344-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-344-mis2 

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:
TGCTGCCAAG GATCCATGTC AGCATGCTCC TCTCTGAGCC CTGGTCT 47



(2) INFORMATION FOR SEQ ID NO: 302: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-366

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-366-mis2
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302: 
AGGGCCTGGC TTCAGGGACA GCTTAGGAAA TGTTTGTTGA GTTAGTG 47



(2) INFORMATION FOR SEQ ID NO: 303: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-359

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-359-mis1
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359-mis2
(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303: 
CTACACAGTC ATCGCCTCCA TCCGGTCTCA ACAAATCCTG GCAGCTC 47

(2) INFORMATION FOR SEQ ID NO: 304: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-355

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-355-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-355-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:
GGAGTTTCGG GGAGTTTCGG GAGGGTTCCT GGGAAGAAGC TCCTCCC 47

(2) INFORMATION FOR SEQ ID NO: 305: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-365
(B) LOCATION: 1..48

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-365-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-365-mis2

(B) LOCATION: complement 25..48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305:
CCTACCAAGC AAGCAGCCCC AGCCTAGGGT CAGACAGGGT GAGCCTC 47



(2) INFORMATION FOR SEQ ID NO: 306: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2452

(B) LOCATION: 1..47

(D) OTHER INFORMATION: Extracted from sequence gb:M10065 (3909..3955)

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2452-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2452-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 306:
TGGGCGCGGA CATGGAGGAC GTGCGCGGCC GCCTGGTGCA GTACCGC 47



(2) INFORMATION FOR SEQ ID NO: 307: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS : SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-344

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID301 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID301 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-344-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-344-mis2 

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:
TGCTGCCAAG GATCCATGTC AGCGTGCTCC TCTCTGAGCC CTGGTCT 47



(2) INFORMATION FOR SEQ ID NO: 308: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-366
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID302

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; t in SEQ ID302 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-366-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-366-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:
AGGGCCTGGC TTCAGGGACA GCTCAGGAAA TGTTTGTTGA GTTAGTG 47



(2) INFORMATION FOR SEQ ID NO: 309: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-359 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID303 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a; g in SEQ ID303 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-359-mis1

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:



CTACACAGTC ATCGCCTCCA TCCAGTCTCA ACAAATCCTG GCAGCTC 47



(2) INFORMATION FOR SEQ ID NO: 310:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY : LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic fragment 99-355 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID304 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a; g in SEQ ID304 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-355-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-355-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310:
GGAGTTTCGG GGAGTTTCGG GAGAGTTCCT GGGAAGAAGC TCCTCCC 47



(2) INFORMATION FOR SEQ ID NO: 311: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-365

(B) LOCATION: 1..48

(D) OTHER INFORMATION: variant version of SEQ ID305

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID305 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-365-mis1
(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-365-mis2

(B) LOCATION: complement 25..48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:
CCTACCAAGC AAGCAGCCCC AGCTTAGGGT CAGACAGGGT GAGCCTC 



(2) INFORMATION FOR SEQ ID NO: 312: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2452

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID306 

(ix) FEATURE: 

(A) NAME /KEY : polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID306 
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2452-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2452-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 312:
TGGGCGCGGA CATGGAGGAC GTGTGCGGCC GCCTGGTGCA GTACCGC 47



(2) INFORMATION FOR SEQ ID NO: 313: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID301 and SEQ ID307

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313: 

GCTCTCATAT TCATTGGGTG 20 



(2) INFORMATION FOR SEQ ID NO: 314: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID302 and SEQ ID308

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:
TCTCTCCCGT GTTAAATG 18

(2) INFORMATION FOR SEQ ID NO: 315:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID303 and SEQ ID309

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:

AATCTTCTTG CTCCTGTC 18



(2) INFORMATION FOR SEQ ID NO: 316: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID304 and SEQ ID310

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316: 

AGGTTAGGGG TGTATTTC 18 
(2) INFORMATION FOR SEQ ID NO: 317:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID305 and SEQ ID311
(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317:

AGACTGTGAC CTTAGACC 18



(2) INFORMATION FOR SEQ ID NO: 318: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID306 and SEQ ID312

(B) LOCATION: 1..18

(D) OTHER INFORMATION: Extracted from sequence gb:M10065 (3791..3808)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318: 



GACGAGACCA TGAAGGAG 18 



(2) INFORMATION FOR SEQ ID NO: 319: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID301 and SEQ ID307

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:

TGGCTGCGGT TAGATGCTC 19 



(2) INFORMATION FOR SEQ ID NO: 320: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID302 and SEQ ID308

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320: 

AGGGGTAACT CTTGATTG 18 



(2) INFORMATION FOR SEQ ID NO: 321: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID303 and SEQ ID309

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321:

ACCAAGCCAT AGCTTCTC 18

(2) INFORMATION FOR SEQ ID NO: 322: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID304 and SEQ ID310

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 322:

ATACAGCCAG GGAGATAG 18 

(2) INFORMATION FOR SEQ ID NO: 323: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID305 and SEQ ID311



(B) LOCATION: 1..18 



WO 99/04038 PCT/IB98/01193

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323: 
AATTGCTACC CCCAATTC 



(2) INFORMATION FOR SEQ ID NO: 324: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID306 and SEQ ID312

(B) LOCATION: 1..18

(D) OTHER INFORMATION: Extracted from sequence gb:M10065 (complement 4378..4395)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:
TCGAACCAGC TCTTGAGG 18
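
The gb:M10065 coordinates quoted for SEQ ID NOs: 306, 318 and 324 allow a quick arithmetic check of where the polymorphic base falls within the amplified region. The following is a minimal Python sketch, assuming the quoted coordinates are 1-based and inclusive; the variable and function names are illustrative and do not appear in the listing.

```python
# Coordinates quoted in the listing for gb:M10065 (assumed 1-based, inclusive).
UPSTREAM_PRIMER = (3791, 3808)     # SEQ ID NO: 318
DOWNSTREAM_PRIMER = (4378, 4395)   # SEQ ID NO: 324 (complement strand)
FRAGMENT = (3909, 3955)            # SEQ ID NO: 306
POLYMORPHIC_BASE_IN_FRAGMENT = 24  # "polymorphic base" LOCATION in SEQ ID NO: 306


def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


# Amplified region runs from the first base of the upstream primer
# to the last base of the downstream primer.
amplicon_length = span_length(UPSTREAM_PRIMER[0], DOWNSTREAM_PRIMER[1])

# Position of the polymorphic base in M10065 coordinates and within the amplicon.
marker_in_m10065 = FRAGMENT[0] + POLYMORPHIC_BASE_IN_FRAGMENT - 1
marker_in_amplicon = marker_in_m10065 - UPSTREAM_PRIMER[0] + 1

print(amplicon_length, marker_in_m10065, marker_in_amplicon)
```

Under these assumptions the interval lengths (18, 18 and 47) agree with the LENGTH fields of the three records.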



(2) INFORMATION FOR SEQ ID NO: 325: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-344.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325: 
TGCTGCCAAG GATCCATGTC AGC 
(2) INFORMATION FOR SEQ ID NO: 326:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-366.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326: 



CCTGGCTTCA GGGACAGCT 



(2) INFORMATION FOR SEQ ID NO: 327: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-359.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327: 



CTACAGAGTC ATCGCCTCCA TCC 



(2) INFORMATION FOR SEQ ID NO: 328: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-355.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328:

GGAGTTTCGG GGAGTTTCGG GAG 23



(2) INFORMATION FOR SEQ ID NO: 329: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-365.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329:
CCAAGCAAGC AGCCCCAGC 19 



(2) INFORMATION FOR SEQ ID NO: 330: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2452.mis1

(B) LOCATION: 1..19
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:
CGCGGACATG GAGGACGTG 19 



(2) INFORMATION FOR SEQ ID NO: 331:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-344.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331: 
CAGGGCTCAG AGAGGAGCA 19 
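
The feature table of SEQ ID NO: 301 places oligo 99-344-mis1 at 1..23 and oligo 99-344-mis2 at complement 25..43, and the same oligos are listed separately as SEQ ID NO: 325 and SEQ ID NO: 331. The relationship can be checked with a minimal Python sketch, assuming that locations are 1-based and inclusive and that "complement" denotes the reverse complement of the stated span; the helper names are hypothetical and not part of the listing.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extract_feature(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract a 1-based inclusive span, optionally from the complementary strand."""
    region = seq[start - 1:end]
    return reverse_complement(region) if complement else region


# Sequences copied from the listing with the block spacing removed.
SEQ_ID_301 = "TGCTGCCAAGGATCCATGTCAGCATGCTCCTCTCTGAGCCCTGGTCT"
SEQ_ID_325 = "TGCTGCCAAGGATCCATGTCAGC"   # 99-344.mis1
SEQ_ID_331 = "CAGGGCTCAGAGAGGAGCA"       # 99-344.mis2

mis1 = extract_feature(SEQ_ID_301, 1, 23)
mis2 = extract_feature(SEQ_ID_301, 25, 43, complement=True)

print(mis1 == SEQ_ID_325, mis2 == SEQ_ID_331)
```

Both oligos end immediately adjacent to position 24, the polymorphic base, one on each strand.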



(2) INFORMATION FOR SEQ ID NO: 332: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-366.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332:
CACTAACTCA ACAAACATTT CCT 23 



(2) INFORMATION FOR SEQ ID NO: 333: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-359.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:



TGCCAGGATT TGTTGAGAC 



(2) INFORMATION FOR SEQ ID NO: 334: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-355.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334: 



GGAGCTTCTT CCCAGGAAC 



(2) INFORMATION FOR SEQ ID NO: 335: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-365.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335: 
GAGGCTCACC CTGTCTGACC CTA 23 



(2) INFORMATION FOR SEQ ID NO: 336: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2452.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336:
GCGGTACTGC ACCAGGCGGC CGC 23
1. [X] Claims Nos.: 111
because they relate to subject matter not required to be searched by this Authority, namely:

Remark: Although claim 111 is directed to a method of treatment of the human/animal body, the search has been carried out and based on the alleged effects of the compound/composition.

2. [ ] Claims Nos.:

3. [ ] Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows: 
127,129 (partial); 1-126,131,138 (complete)
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



1. Claims: 127,129 (partial); 1-126,131,138 (complete) 

An isolated nucleic acid comprising a biallelic marker
sequence selected from the group of SEQ ID NO:301, SEQ ID
NO:307, their complementary sequences, or fragments
comprising at least 8 consecutive nucleotides including the
biallelic marker site,

a set of nucleic acids including such biallelic markers,
methods of obtaining such a set, arrays of nucleic acids
comprising such a set, a map comprising such an array,
methods of identifying biallelic markers associated with a 
detectable trait or an individual's risk of developing such 
a trait, methods of identifying a gene or a haplotype 
associated with such a trait, a method of selecting an 
individual for a treatment, a method of treatment of such an 
individual, and a method of determining an individual's risk 
of developing or possessing Alzheimer's disease. 

2. Claims: 127,129 (partial); 132 (complete) 

An isolated nucleic acid comprising a biallelic marker 
sequence selected from the group of SEQ ID NO:302, SEQ ID
NO:308, their complementary sequences, or fragments
comprising at least 8 consecutive nucleotides including the
biallelic marker site, and a method of determining an 
individual's risk of developing or possessing Alzheimer's 
disease. 

3. Claims: 127,129 (partial); 133 (complete) 

An isolated nucleic acid comprising a biallelic marker 
sequence selected from the group of SEQ ID NO:303, SEQ ID 
NO: 309, their complementary sequences, or fragments 
comprising at least 8 consecutive nucleotides including the 
biallelic marker site, and a method of determining an 
individual's risk of developing or possessing Alzheimer's 
disease. 



4. Claims: 127,129 (partial); 134 (complete) 

An isolated nucleic acid comprising a biallelic marker 
sequence selected from the group of SEQ ID NO: 304, SEQ ID 
NO:310, their complementary sequences, or fragments 
comprising at least 8 consecutive nucleotides including the 
biallelic marker site, and a method of determining an 
individual's risk of developing or possessing Alzheimer's 
disease. 

5. Claims: 127,129 (partial); 135 (complete) 

An isolated nucleic acid comprising a biallelic marker 
sequence selected from the group of SEQ ID NO:305, SEQ ID
NO: 311, their complementary sequences, or fragments 
comprising at least 8 consecutive nucleotides including the 
biallelic marker site, and a method of determining an 
individual's risk of developing or possessing Alzheimer's 
disease. 



6. Claims: 127 (partial); 128,130 (complete) 

An isolated nucleic acid comprising a biallelic marker 
sequence selected from SEQ ID NO:306, its complementary
sequence, or fragments comprising at least 8 consecutive 
nucleotides including the biallelic marker site, and a 
method of determining an individual's risk of developing or 
possessing Alzheimer's disease. 

7. Claims: 136,139 (complete) 

An isolated nucleic acid primer for amplification selected 
from the group of SEQ ID Nos:313-317 and SEQ ID Nos:319-323, 
their complementary sequences, or fragments comprising at 
least 8 consecutive nucleotides, and a set of nucleic acids 
comprising such a primer. 

8. Claims: 137,140 (complete) 

An isolated nucleic acid primer for microsequencing selected 
from the group of SEQ ID Nos:325-329 and SEQ ID Nos:331-335, 
their complementary sequences, or fragments comprising at 
least 8 consecutive nucleotides, and a set of nucleic acids 
comprising such a primer. 
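
Several of the groups above refer to fragments "comprising at least 8 consecutive nucleotides including the biallelic marker site". As an illustration only, the minimal Python sketch below enumerates the shortest such fragments (exactly 8 nucleotides) for SEQ ID NO: 301, assuming the marker site is the polymorphic base at 1-based position 24. The function name is hypothetical, and this is one possible reading of the phrase, not a construction of the claims.

```python
def fragments_with_site(seq: str, site: int, length: int) -> list[str]:
    """Return every contiguous fragment of `length` bases covering 1-based `site`."""
    if not 1 <= site <= len(seq):
        raise ValueError("site must lie within the sequence")
    if not 1 <= length <= len(seq):
        raise ValueError("length must be between 1 and the sequence length")
    # Earliest and latest 1-based start positions whose window still contains the site.
    first_start = max(1, site - length + 1)
    last_start = min(site, len(seq) - length + 1)
    return [seq[s - 1:s - 1 + length] for s in range(first_start, last_start + 1)]


# SEQ ID NO: 301 with the block spacing removed; polymorphic base at position 24.
SEQ_ID_301 = "TGCTGCCAAGGATCCATGTCAGCATGCTCCTCTCTGAGCCCTGGTCT"

eight_mers = fragments_with_site(SEQ_ID_301, site=24, length=8)
for fragment in eight_mers:
    print(fragment)
```

Longer fragments can be listed by raising `length`; every returned window contains position 24.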